Overview

Dataset statistics

Number of variables: 41
Number of observations: 59400
Missing cells: 46094
Missing cells (%): 1.9%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 116.0 MiB
Average record size in memory: 2.0 KiB
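The derived figures in this panel follow from the raw counts. A minimal sketch in plain Python, assuming the missing percentage is taken over all cells (variables × observations) and that 1 MiB = 1024 KiB; the variable names are illustrative only:

```python
# Raw figures copied from the dataset statistics panel.
n_variables = 41
n_observations = 59400
missing_cells = 46094
total_size_mib = 116.0

# Share of missing cells over the whole table (variables x observations).
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Average record size in KiB, assuming 1 MiB = 1024 KiB.
avg_record_kib = total_size_mib * 1024 / n_observations

print(f"Missing cells: {missing_pct:.1f}%")
print(f"Average record size: {avg_record_kib:.1f} KiB")
```

Both values round to the figures shown in the panel (1.9% and 2.0 KiB).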

Variable types

CAT: 29
NUM: 10
BOOL: 2

Reproduction

Analysis started: 2020-07-03 15:52:15.337954
Analysis finished: 2020-07-03 15:52:38.650976
Duration: 23.31 seconds
Version: pandas-profiling v2.8.0
Command line: pandas_profiling --config_file config.yaml [YOUR_FILE.csv]
Download configuration: config.yaml
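The reported duration can be checked against the two timestamps. A small sketch using only the standard library; the format string is an assumption about how the timestamps are laid out, based on how they appear here:

```python
from datetime import datetime

# Timestamp layout as it appears in the Reproduction panel (assumed format).
FMT = "%Y-%m-%d %H:%M:%S.%f"

started = datetime.strptime("2020-07-03 15:52:15.337954", FMT)
finished = datetime.strptime("2020-07-03 15:52:38.650976", FMT)

# Elapsed wall-clock time between start and finish, in seconds.
duration = (finished - started).total_seconds()
print(f"{duration:.2f} seconds")
```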

Warnings

recorded_by has constant value "GeoData Consultants Ltd" (Constant)
date_recorded has a high cardinality: 356 distinct values (High cardinality)
funder has a high cardinality: 1897 distinct values (High cardinality)
installer has a high cardinality: 2145 distinct values (High cardinality)
wpt_name has a high cardinality: 37400 distinct values (High cardinality)
subvillage has a high cardinality: 19287 distinct values (High cardinality)
lga has a high cardinality: 125 distinct values (High cardinality)
ward has a high cardinality: 2092 distinct values (High cardinality)
scheme_name has a high cardinality: 2696 distinct values (High cardinality)
extraction_type_group is highly correlated with extraction_type and 1 other fields (High correlation)
extraction_type is highly correlated with extraction_type_group and 1 other fields (High correlation)
extraction_type_class is highly correlated with extraction_type and 1 other fields (High correlation)
management_group is highly correlated with management (High correlation)
management is highly correlated with management_group (High correlation)
payment_type is highly correlated with payment (High correlation)
payment is highly correlated with payment_type (High correlation)
quality_group is highly correlated with water_quality (High correlation)
water_quality is highly correlated with quality_group (High correlation)
quantity_group is highly correlated with quantity (High correlation)
quantity is highly correlated with quantity_group (High correlation)
source_type is highly correlated with source and 1 other fields (High correlation)
source is highly correlated with source_type and 1 other fields (High correlation)
source_class is highly correlated with source and 1 other fields (High correlation)
waterpoint_type_group is highly correlated with waterpoint_type (High correlation)
waterpoint_type is highly correlated with waterpoint_type_group (High correlation)
funder has 3635 (6.1%) missing values (Missing)
installer has 3655 (6.2%) missing values (Missing)
public_meeting has 3334 (5.6%) missing values (Missing)
scheme_management has 3877 (6.5%) missing values (Missing)
scheme_name has 28166 (47.4%) missing values (Missing)
permit has 3056 (5.1%) missing values (Missing)
amount_tsh is highly skewed (γ1 = 57.80779995) (Skewed)
num_private is highly skewed (γ1 = 91.93374999) (Skewed)
id has unique values (Unique)
amount_tsh has 41639 (70.1%) zeros (Zeros)
gps_height has 20438 (34.4%) zeros (Zeros)
longitude has 1812 (3.1%) zeros (Zeros)
num_private has 58643 (98.7%) zeros (Zeros)
population has 21381 (36.0%) zeros (Zeros)
construction_year has 20709 (34.9%) zeros (Zeros)
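Warnings of this kind are typically produced by comparing per-column summaries against cut-offs. The sketch below illustrates the idea only: the two thresholds and both function names are hypothetical, chosen so that the sketch agrees with the flags in this report, and are not read from pandas-profiling's configuration.

```python
# Hypothetical cut-offs, chosen only so that this sketch agrees with the
# flags listed in the report; they are not read from pandas-profiling's config.
HIGH_CARDINALITY_MIN_DISTINCT = 50
SKEWNESS_MIN_ABS = 20.0


def categorical_flags(n_distinct):
    """Return warning labels for a categorical column's distinct count."""
    if n_distinct == 1:
        return ["Constant"]
    if n_distinct > HIGH_CARDINALITY_MIN_DISTINCT:
        return ["High cardinality"]
    return []


def numeric_flags(skewness):
    """Return warning labels for a numeric column's skewness."""
    return ["Skewed"] if abs(skewness) > SKEWNESS_MIN_ABS else []


print(categorical_flags(125))       # lga: 125 distinct values
print(categorical_flags(9))         # basin: 9 distinct values
print(numeric_flags(57.80779995))   # amount_tsh
print(numeric_flags(-4.191046455))  # longitude
```

Under these assumed cut-offs, lga and amount_tsh are flagged while basin and longitude are not, which matches the warning list.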

Variables

id
Real number (ℝ≥0)

UNIQUE

Distinct count: 59400
Unique (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 37115.131767676765
Minimum: 0
Maximum: 74247
Zeros: 1
Zeros (%): < 0.1%
Memory size: 464.2 KiB

Quantile statistics

Minimum: 0
5-th percentile: 3730.9
Q1: 18519.75
Median: 37061.5
Q3: 55656.5
95-th percentile: 70564.05
Maximum: 74247
Range: 74247
Interquartile range (IQR): 37136.75

Descriptive statistics

Standard deviation: 21453.12837
Coefficient of variation (CV): 0.5780156866
Kurtosis: -1.201515029
Mean: 37115.13177
Median Absolute Deviation (MAD): 18568.5
Skewness: 0.00262253035
Sum: 2204638827
Variance: 460236716.9
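Several of these statistics are functions of the others, which gives a quick consistency check. A minimal sketch using the figures reported for id; the usual textbook definitions are assumed (range = max − min, IQR = Q3 − Q1, CV = standard deviation / mean, variance = standard deviation squared):

```python
import math

# Figures copied from the id panels.
minimum, maximum = 0, 74247
q1, q3 = 18519.75, 55656.5
mean = 37115.13177
std = 21453.12837

value_range = maximum - minimum  # Range
iqr = q3 - q1                    # Interquartile range
cv = std / mean                  # Coefficient of variation
variance = std ** 2              # Variance

# Compare with the reported values, allowing for rounding in the report.
print(value_range == 74247)
print(iqr == 37136.75)
print(math.isclose(cv, 0.5780156866, rel_tol=1e-6))
print(math.isclose(variance, 460236716.9, rel_tol=1e-6))
```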
Histogram with fixed size bins (bins=10)

Value                  Count    Frequency (%)
2047                   1        < 0.1%
72310                  1        < 0.1%
49805                  1        < 0.1%
51852                  1        < 0.1%
62091                  1        < 0.1%
64138                  1        < 0.1%
57993                  1        < 0.1%
60040                  1        < 0.1%
33413                  1        < 0.1%
35460                  1        < 0.1%
45699                  1        < 0.1%
41601                  1        < 0.1%
43648                  1        < 0.1%
70263                  1        < 0.1%
68212                  1        < 0.1%
20442                  1        < 0.1%
23134                  1        < 0.1%
19036                  1        < 0.1%
29275                  1        < 0.1%
25177                  1        < 0.1%
27224                  1        < 0.1%
4695                   1        < 0.1%
6742                   1        < 0.1%
597                    1        < 0.1%
2644                   1        < 0.1%
Other values (59375)   59375    > 99.9%

Value                  Count    Frequency (%)
0                      1        < 0.1%
1                      1        < 0.1%
2                      1        < 0.1%
3                      1        < 0.1%
4                      1        < 0.1%
5                      1        < 0.1%
6                      1        < 0.1%
7                      1        < 0.1%
8                      1        < 0.1%
9                      1        < 0.1%

Value                  Count    Frequency (%)
74247                  1        < 0.1%
74246                  1        < 0.1%
74243                  1        < 0.1%
74242                  1        < 0.1%
74240                  1        < 0.1%
74239                  1        < 0.1%
74238                  1        < 0.1%
74237                  1        < 0.1%
74236                  1        < 0.1%
74235                  1        < 0.1%

amount_tsh
Real number (ℝ≥0)

SKEWED
ZEROS

Distinct count: 98
Unique (%): 0.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 317.6503846801347
Minimum: 0.0
Maximum: 350000.0
Zeros: 41639
Zeros (%): 70.1%
Memory size: 464.2 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 20
95-th percentile: 1200
Maximum: 350000
Range: 350000
Interquartile range (IQR): 20

Descriptive statistics

Standard deviation: 2997.574558
Coefficient of variation (CV): 9.436709989
Kurtosis: 4903.543102
Mean: 317.6503847
Median Absolute Deviation (MAD): 0
Skewness: 57.80779995
Sum: 18868432.85
Variance: 8985453.232
Histogram with fixed size bins (bins=10)

Value                  Count    Frequency (%)
0                      41639    70.1%
500                    3102     5.2%
50                     2472     4.2%
1000                   1488     2.5%
20                     1463     2.5%
200                    1220     2.1%
100                    816      1.4%
10                     806      1.4%
30                     743      1.3%
2000                   704      1.2%
250                    569      1.0%
300                    557      0.9%
5000                   450      0.8%
5                      376      0.6%
25                     356      0.6%
3000                   334      0.6%
1200                   267      0.4%
1500                   197      0.3%
6                      190      0.3%
600                    176      0.3%
4000                   156      0.3%
2400                   145      0.2%
2500                   139      0.2%
6000                   125      0.2%
7                      69       0.1%
Other values (73)      841      1.4%

Value                  Count    Frequency (%)
0                      41639    70.1%
0.2                    3        < 0.1%
0.25                   1        < 0.1%
1                      3        < 0.1%
2                      13       < 0.1%
5                      376      0.6%
6                      190      0.3%
7                      69       0.1%
9                      1        < 0.1%
10                     806      1.4%

Value                  Count    Frequency (%)
350000                 1        < 0.1%
250000                 1        < 0.1%
200000                 1        < 0.1%
170000                 1        < 0.1%
138000                 1        < 0.1%
120000                 1        < 0.1%
117000                 7        < 0.1%
100000                 3        < 0.1%
70000                  1        < 0.1%
60000                  1        < 0.1%

date_recorded
Categorical

HIGH CARDINALITY

Distinct count: 356
Unique (%): 0.6%
Missing: 0
Missing (%): 0.0%
Memory size: 464.2 KiB
Top values: 2011-03-15 (572), 2011-03-17 (558), 2013-02-03 (546), 2011-03-14 (520), 2011-03-16 (513); other values (351): 56691
Value                  Count    Frequency (%)
2011-03-15             572      1.0%
2011-03-17             558      0.9%
2013-02-03             546      0.9%
2011-03-14             520      0.9%
2011-03-16             513      0.9%
2011-03-18             497      0.8%
2011-03-19             466      0.8%
2013-02-04             464      0.8%
2013-01-29             459      0.8%
2011-03-04             458      0.8%
2013-02-14             444      0.7%
2013-01-24             435      0.7%
2011-03-05             434      0.7%
2013-02-15             429      0.7%
2013-03-15             428      0.7%
2011-03-11             426      0.7%
2013-01-30             421      0.7%
2013-02-16             418      0.7%
2011-03-23             417      0.7%
2011-03-09             416      0.7%
2013-01-18             409      0.7%
2013-02-26             391      0.7%
2011-03-30             391      0.7%
2011-03-24             381      0.6%
2013-03-19             381      0.6%
Other values (331)     48126    81.0%

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Overview of Unicode Properties

Unique unicode characters: 11
Unique unicode categories: 2
Unique unicode scripts: 1
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
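As an illustration of such character properties, Python's standard `unicodedata` module exposes the general category of each code point. The sketch below counts categories over two date_recorded values taken from this report; the sample list is illustrative, not the full column. `Nd` is the Unicode code for Decimal Number and `Pd` for Dash Punctuation.

```python
import unicodedata
from collections import Counter

# Two values from date_recorded, used here as a small illustrative sample.
sample = ["2011-03-15", "2013-02-03"]

# Count the Unicode general category of every character in the sample.
category_counts = Counter(
    unicodedata.category(ch) for value in sample for ch in value
)

total = sum(category_counts.values())
for category, count in category_counts.most_common():
    print(category, count, f"{100 * count / total:.1f}%")
```

Each ISO-style date has eight digits and two hyphens, so the sample splits 80% / 20% between the two categories, the same proportions reported for the full column.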

Most occurring characters

ValueCountFrequency (%) 
013905923.4%
 
112901221.7%
 
-11880020.0%
 
210386717.5%
 
3528208.9%
 
7128532.2%
 
4107121.8%
 
893631.6%
 
661541.0%
 
560341.0%
 
953260.9%
 

Most occurring categories

ValueCountFrequency (%) 
Decimal Number47520080.0%
 
Dash Punctuation11880020.0%
 

Most frequent Decimal Number characters

ValueCountFrequency (%) 
013905929.3%
 
112901227.1%
 
210386721.9%
 
35282011.1%
 
7128532.7%
 
4107122.3%
 
893632.0%
 
661541.3%
 
560341.3%
 
953261.1%
 

Most frequent Dash Punctuation characters

ValueCountFrequency (%) 
-118800100.0%
 

Most occurring scripts

ValueCountFrequency (%) 
Common594000100.0%
 

Most frequent Common characters

ValueCountFrequency (%) 
013905923.4%
 
112901221.7%
 
-11880020.0%
 
210386717.5%
 
3528208.9%
 
7128532.2%
 
4107121.8%
 
893631.6%
 
661541.0%
 
560341.0%
 
953260.9%
 

Most occurring blocks

ValueCountFrequency (%) 
ASCII594000100.0%
 

Most frequent ASCII characters

ValueCountFrequency (%) 
013905923.4%
 
112901221.7%
 
-11880020.0%
 
210386717.5%
 
3528208.9%
 
7128532.2%
 
4107121.8%
 
893631.6%
 
661541.0%
 
560341.0%
 
953260.9%
 

funder
Categorical

HIGH CARDINALITY
MISSING

Distinct count: 1897
Unique (%): 3.4%
Missing: 3635
Missing (%): 6.1%
Memory size: 464.2 KiB
Top values: Government Of Tanzania (9084), Danida (3114), Hesawa (2202), Rwssp (1374), World Bank (1349); other values (1892): 38642
Value                    Count    Frequency (%)
Government Of Tanzania   9084     15.3%
Danida                   3114     5.2%
Hesawa                   2202     3.7%
Rwssp                    1374     2.3%
World Bank               1349     2.3%
Kkkt                     1287     2.2%
World Vision             1246     2.1%
Unicef                   1057     1.8%
Tasaf                    877      1.5%
District Council         843      1.4%
Dhv                      829      1.4%
Private Individual       826      1.4%
Dwsp                     811      1.4%
0                        777      1.3%
Norad                    765      1.3%
Germany Republi          610      1.0%
Tcrs                     602      1.0%
Ministry Of Water        590      1.0%
Water                    583      1.0%
Dwe                      484      0.8%
Netherlands              470      0.8%
Hifab                    450      0.8%
Adb                      448      0.8%
Lga                      442      0.7%
Amref                    425      0.7%
Other values (1872)      24220    40.8%
(Missing)                3635     6.1%

Length

Max length: 30
Median length: 6
Mean length: 9.505824916
Min length: 1

Overview of Unicode Properties

Unique unicode characters: 69
Unique unicode categories: 9
Unique unicode scripts: 2
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

ValueCountFrequency (%) 
a7183512.7%
 
n6511211.5%
 
i380116.7%
 
e374646.6%
 
346736.1%
 
r278794.9%
 
t230164.1%
 
o227414.0%
 
s172083.0%
 
d154642.7%
 
f153292.7%
 
m151402.7%
 
v130302.3%
 
T121102.1%
 
l112192.0%
 
G107221.9%
 
O106131.9%
 
z96871.7%
 
c92161.6%
 
w79711.4%
 
D79281.4%
 
u78841.4%
 
W73521.3%
 
p69921.2%
 
k64961.2%
 
Other values (44)5955410.5%
 

Most occurring categories

ValueCountFrequency (%) 
Lowercase Letter43678577.4%
 
Uppercase Letter8970515.9%
 
Space Separator346736.1%
 
Other Punctuation13220.2%
 
Decimal Number8030.1%
 
Open Punctuation4370.1%
 
Close Punctuation4310.1%
 
Dash Punctuation3230.1%
 
Connector Punctuation167< 0.1%
 

Most frequent Uppercase Letter characters

ValueCountFrequency (%) 
T1211013.5%
 
G1072212.0%
 
O1061311.8%
 
D79288.8%
 
W73528.2%
 
C46795.2%
 
R44545.0%
 
H34623.9%
 
M31353.5%
 
K29623.3%
 
A29203.3%
 
S26533.0%
 
I24712.8%
 
B20572.3%
 
N20262.3%
 
P19842.2%
 
U18772.1%
 
V17952.0%
 
L14721.6%
 
F13861.5%
 
J8420.9%
 
E4440.5%
 
Y2330.3%
 
Q1110.1%
 
Z16< 0.1%
 

Most frequent Lowercase Letter characters

ValueCountFrequency (%) 
a7183516.4%
 
n6511214.9%
 
i380118.7%
 
e374648.6%
 
r278796.4%
 
t230165.3%
 
o227415.2%
 
s172083.9%
 
d154643.5%
 
f153293.5%
 
m151403.5%
 
v130303.0%
 
l112192.6%
 
z96872.2%
 
c92162.1%
 
w79711.8%
 
u78841.8%
 
p69921.6%
 
k64961.5%
 
h56941.3%
 
g30740.7%
 
b27350.6%
 
y26790.6%
 
x5650.1%
 
j3130.1%
 

Most frequent Space Separator characters

ValueCountFrequency (%) 
34673100.0%
 

Most frequent Open Punctuation characters

ValueCountFrequency (%) 
(43499.3%
 
[30.7%
 

Most frequent Close Punctuation characters

ValueCountFrequency (%) 
)42999.5%
 
]20.5%
 

Most frequent Other Punctuation characters

ValueCountFrequency (%) 
/78359.2%
 
.46935.5%
 
\332.5%
 
&262.0%
 
'110.8%
 

Most frequent Connector Punctuation characters

ValueCountFrequency (%) 
_167100.0%
 

Most frequent Decimal Number characters

ValueCountFrequency (%) 
079398.8%
 
250.6%
 
120.2%
 
920.2%
 
410.1%
 

Most frequent Dash Punctuation characters

ValueCountFrequency (%) 
-323100.0%
 

Most occurring scripts

ValueCountFrequency (%) 
Latin52649093.2%
 
Common381566.8%
 

Most frequent Latin characters

ValueCountFrequency (%) 
a7183513.6%
 
n6511212.4%
 
i380117.2%
 
e374647.1%
 
r278795.3%
 
t230164.4%
 
o227414.3%
 
s172083.3%
 
d154642.9%
 
f153292.9%
 
m151402.9%
 
v130302.5%
 
T121102.3%
 
l112192.1%
 
G107222.0%
 
O106132.0%
 
z96871.8%
 
c92161.8%
 
w79711.5%
 
D79281.5%
 
u78841.5%
 
W73521.4%
 
p69921.3%
 
k64961.2%
 
h56941.1%
 
Other values (27)503779.6%
 

Most frequent Common characters

ValueCountFrequency (%) 
3467390.9%
 
07932.1%
 
/7832.1%
 
.4691.2%
 
(4341.1%
 
)4291.1%
 
-3230.8%
 
_1670.4%
 
\330.1%
 
&260.1%
 
'11< 0.1%
 
25< 0.1%
 
[3< 0.1%
 
12< 0.1%
 
]2< 0.1%
 
92< 0.1%
 
41< 0.1%
 

Most occurring blocks

ValueCountFrequency (%) 
ASCII564646100.0%
 

Most frequent ASCII characters

ValueCountFrequency (%) 
a7183512.7%
 
n6511211.5%
 
i380116.7%
 
e374646.6%
 
346736.1%
 
r278794.9%
 
t230164.1%
 
o227414.0%
 
s172083.0%
 
d154642.7%
 
f153292.7%
 
m151402.7%
 
v130302.3%
 
T121102.1%
 
l112192.0%
 
G107221.9%
 
O106131.9%
 
z96871.7%
 
c92161.6%
 
w79711.4%
 
D79281.4%
 
u78841.4%
 
W73521.3%
 
p69921.2%
 
k64961.2%
 
Other values (44)5955410.5%
 

gps_height
Real number (ℝ)

ZEROS

Distinct count: 2428
Unique (%): 4.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 668.297239057239
Minimum: -90
Maximum: 2770
Zeros: 20438
Zeros (%): 34.4%
Memory size: 464.2 KiB

Quantile statistics

Minimum: -90
5-th percentile: 0
Q1: 0
Median: 369
Q3: 1319.25
95-th percentile: 1797
Maximum: 2770
Range: 2860
Interquartile range (IQR): 1319.25

Descriptive statistics

Standard deviation: 693.1163503
Coefficient of variation (CV): 1.037137833
Kurtosis: -1.292440135
Mean: 668.2972391
Median Absolute Deviation (MAD): 369
Skewness: 0.462402085
Sum: 39696856
Variance: 480410.2751
Histogram with fixed size bins (bins=10)

Value                  Count    Frequency (%)
0                      20438    34.4%
-15                    60       0.1%
-16                    55       0.1%
-13                    55       0.1%
-20                    52       0.1%
1290                   52       0.1%
-14                    51       0.1%
303                    51       0.1%
-18                    49       0.1%
-19                    47       0.1%
1269                   46       0.1%
1295                   46       0.1%
1304                   45       0.1%
-23                    45       0.1%
280                    44       0.1%
1538                   44       0.1%
1286                   44       0.1%
-8                     44       0.1%
-17                    44       0.1%
1332                   43       0.1%
320                    43       0.1%
1317                   42       0.1%
1293                   42       0.1%
1319                   42       0.1%
1359                   42       0.1%
Other values (2403)    37834    63.7%

Value                  Count    Frequency (%)
-90                    1        < 0.1%
-63                    2        < 0.1%
-59                    1        < 0.1%
-57                    1        < 0.1%
-55                    1        < 0.1%
-54                    1        < 0.1%
-53                    1        < 0.1%
-52                    2        < 0.1%
-51                    2        < 0.1%
-50                    5        < 0.1%

Value                  Count    Frequency (%)
2770                   1        < 0.1%
2628                   1        < 0.1%
2627                   1        < 0.1%
2626                   2        < 0.1%
2623                   1        < 0.1%
2614                   1        < 0.1%
2585                   1        < 0.1%
2576                   1        < 0.1%
2569                   1        < 0.1%
2568                   1        < 0.1%

installer
Categorical

HIGH CARDINALITY
MISSING

Distinct count: 2145
Unique (%): 3.8%
Missing: 3655
Missing (%): 6.2%
Memory size: 464.2 KiB
Top values: DWE (17402), Government (1825), RWE (1206), Commu (1060), DANIDA (1050); other values (2140): 33202
Value                  Count    Frequency (%)
DWE                    17402    29.3%
Government             1825     3.1%
RWE                    1206     2.0%
Commu                  1060     1.8%
DANIDA                 1050     1.8%
KKKT                   898      1.5%
Hesawa                 840      1.4%
0                      777      1.3%
TCRS                   707      1.2%
Central government     622      1.0%
CES                    610      1.0%
Community              553      0.9%
DANID                  552      0.9%
District Council       551      0.9%
HESAWA                 539      0.9%
LGA                    408      0.7%
World vision           408      0.7%
WEDECO                 397      0.7%
TASAF                  396      0.7%
District council       392      0.7%
Gover                  383      0.6%
AMREF                  329      0.6%
TWESA                  316      0.5%
WU                     301      0.5%
Dmdd                   287      0.5%
Other values (2120)    22936    38.6%
(Missing)              3655     6.2%
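Several installer labels in this table differ only in letter case (for example "Hesawa" / "HESAWA" and "District Council" / "District council"). One possible way to see how much case-folding alone would consolidate them is sketched below. It rests on the assumption that such labels denote the same installer, which the report itself does not establish; the dictionary holds a handful of counts copied from the table, not the full column.

```python
from collections import Counter

# A few counts copied from the installer table (not the full column).
installer_counts = {
    "DWE": 17402,
    "Government": 1825,
    "Hesawa": 840,
    "HESAWA": 539,
    "District Council": 551,
    "District council": 392,
}

# Merge labels that are identical after case-folding.
merged = Counter()
for label, count in installer_counts.items():
    merged[label.casefold()] += count

for label, count in merged.most_common():
    print(label, count)
```

Truncated variants such as "DANID" or "Gover" are not handled by case-folding and would need a separate, explicitly justified mapping.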
 

Length

Max length: 30
Median length: 4
Mean length: 5.91976431
Min length: 1

Overview of Unicode Properties

Unique unicode characters: 70
Unique unicode categories: 10
Unique unicode scripts: 2
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

ValueCountFrequency (%) 
D275957.8%
 
W258497.4%
 
E253897.2%
 
n238686.8%
 
a209986.0%
 
e155004.4%
 
i150534.3%
 
A136683.9%
 
r133773.8%
 
t129043.7%
 
126733.6%
 
o123983.5%
 
C105353.0%
 
m92892.6%
 
S66591.9%
 
R65181.9%
 
l62011.8%
 
s61731.8%
 
I61601.8%
 
T59481.7%
 
u54361.5%
 
K53901.5%
 
c48351.4%
 
N46741.3%
 
G44661.3%
 
Other values (45)5007814.2%
 

Most occurring categories

ValueCountFrequency (%) 
Lowercase Letter16915548.1%
 
Uppercase Letter16743847.6%
 
Space Separator126733.6%
 
Other Punctuation9710.3%
 
Decimal Number7830.2%
 
Dash Punctuation2680.1%
 
Connector Punctuation169< 0.1%
 
Open Punctuation159< 0.1%
 
Close Punctuation16< 0.1%
 
Currency Symbol2< 0.1%
 

Most frequent Uppercase Letter characters

ValueCountFrequency (%) 
D2759516.5%
 
W2584915.4%
 
E2538915.2%
 
A136688.2%
 
C105356.3%
 
S66594.0%
 
R65183.9%
 
I61603.7%
 
T59483.6%
 
K53903.2%
 
N46742.8%
 
G44662.7%
 
M42572.5%
 
H34552.1%
 
O31491.9%
 
F31091.9%
 
L25091.5%
 
U22281.3%
 
P19511.2%
 
V15830.9%
 
B7960.5%
 
J7620.5%
 
X3560.2%
 
Y2450.1%
 
Z1290.1%
 

Most frequent Lowercase Letter characters

ValueCountFrequency (%) 
n2386814.1%
 
a2099812.4%
 
e155009.2%
 
i150538.9%
 
r133777.9%
 
t129047.6%
 
o123987.3%
 
m92895.5%
 
l62013.7%
 
s61733.6%
 
u54363.2%
 
c48352.9%
 
v44332.6%
 
d42102.5%
 
w33382.0%
 
g26791.6%
 
y17941.1%
 
h17021.0%
 
p14340.8%
 
k13930.8%
 
f8020.5%
 
b5050.3%
 
j4820.3%
 
z3230.2%
 
x14< 0.1%
 

Most frequent Space Separator characters

ValueCountFrequency (%) 
12673100.0%
 

Most frequent Connector Punctuation characters

ValueCountFrequency (%) 
_169100.0%
 

Most frequent Other Punctuation characters

ValueCountFrequency (%) 
/67069.0%
 
.23824.5%
 
&505.1%
 
'121.2%
 
#10.1%
 

Most frequent Decimal Number characters

ValueCountFrequency (%) 
078099.6%
 
110.1%
 
410.1%
 
910.1%
 

Most frequent Dash Punctuation characters

ValueCountFrequency (%) 
-268100.0%
 

Most frequent Open Punctuation characters

ValueCountFrequency (%) 
(15798.7%
 
[21.3%
 

Most frequent Close Punctuation characters

ValueCountFrequency (%) 
}1381.2%
 
]212.5%
 
)16.2%
 

Most frequent Currency Symbol characters

ValueCountFrequency (%) 
$2100.0%
 

Most occurring scripts

ValueCountFrequency (%) 
Latin33659395.7%
 
Common150414.3%
 

Most frequent Latin characters

ValueCountFrequency (%) 
D275958.2%
 
W258497.7%
 
E253897.5%
 
n238687.1%
 
a209986.2%
 
e155004.6%
 
i150534.5%
 
A136684.1%
 
r133774.0%
 
t129043.8%
 
o123983.7%
 
C105353.1%
 
m92892.8%
 
S66592.0%
 
R65181.9%
 
l62011.8%
 
s61731.8%
 
I61601.8%
 
T59481.8%
 
u54361.6%
 
K53901.6%
 
c48351.4%
 
N46741.4%
 
G44661.3%
 
v44331.3%
 
Other values (27)4327712.9%
 

Most frequent Common characters

ValueCountFrequency (%) 
1267384.3%
 
07805.2%
 
/6704.5%
 
-2681.8%
 
.2381.6%
 
_1691.1%
 
(1571.0%
 
&500.3%
 
}130.1%
 
'120.1%
 
$2< 0.1%
 
[2< 0.1%
 
]2< 0.1%
 
)1< 0.1%
 
11< 0.1%
 
#1< 0.1%
 
41< 0.1%
 
91< 0.1%
 

Most occurring blocks

ValueCountFrequency (%) 
ASCII351634100.0%
 

Most frequent ASCII characters

ValueCountFrequency (%) 
D275957.8%
 
W258497.4%
 
E253897.2%
 
n238686.8%
 
a209986.0%
 
e155004.4%
 
i150534.3%
 
A136683.9%
 
r133773.8%
 
t129043.7%
 
126733.6%
 
o123983.5%
 
C105353.0%
 
m92892.6%
 
S66591.9%
 
R65181.9%
 
l62011.8%
 
s61731.8%
 
I61601.8%
 
T59481.7%
 
u54361.5%
 
K53901.5%
 
c48351.4%
 
N46741.3%
 
G44661.3%
 
Other values (45)5007814.2%
 

longitude
Real number (ℝ≥0)

ZEROS

Distinct count: 57516
Unique (%): 96.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 34.077426692028794
Minimum: 0.0
Maximum: 40.34519307
Zeros: 1812
Zeros (%): 3.1%
Memory size: 464.2 KiB

Quantile statistics

Minimum: 0
5-th percentile: 30.04066001
Q1: 33.09034738
Median: 34.90874343
Q3: 37.17838657
95-th percentile: 39.13323954
Maximum: 40.34519307
Range: 40.34519307
Interquartile range (IQR): 4.08803919

Descriptive statistics

Standard deviation: 6.567431846
Coefficient of variation (CV): 0.1927208854
Kurtosis: 19.18703105
Mean: 34.07742669
Median Absolute Deviation (MAD): 2.032511095
Skewness: -4.191046455
Sum: 2024199.146
Variance: 43.13116105
Histogram with fixed size bins (bins=10)

Value                  Count    Frequency (%)
0                      1812     3.1%
37.54090064            2        < 0.1%
33.01050977            2        < 0.1%
39.09348389            2        < 0.1%
32.9727187             2        < 0.1%
33.00627548            2        < 0.1%
39.10395018            2        < 0.1%
37.54278497            2        < 0.1%
36.80248988            2        < 0.1%
39.09837398            2        < 0.1%
33.09034738            2        < 0.1%
33.00503158            2        < 0.1%
32.9780624             2        < 0.1%
39.08887513            2        < 0.1%
31.61952953            2        < 0.1%
39.09309544            2        < 0.1%
39.10530661            2        < 0.1%
32.93668943            2        < 0.1%
32.98751118            2        < 0.1%
39.09087979            2        < 0.1%
37.31425027            2        < 0.1%
32.98478963            2        < 0.1%
39.09143391            2        < 0.1%
37.27435243            2        < 0.1%
32.91986139            2        < 0.1%
Other values (57491)   57540    96.9%

Value                  Count    Frequency (%)
0                      1812     3.1%
29.6071219             1        < 0.1%
29.60720109            1        < 0.1%
29.61032056            1        < 0.1%
29.61096482            1        < 0.1%
29.61194674            1        < 0.1%
29.61250689            1        < 0.1%
29.61276296            1        < 0.1%
29.61344309            1        < 0.1%
29.6168718             1        < 0.1%

Value                  Count    Frequency (%)
40.34519307            1        < 0.1%
40.34430089            1        < 0.1%
40.32523996            1        < 0.1%
40.32522643            1        < 0.1%
40.32340181            1        < 0.1%
40.32283237            1        < 0.1%
40.32280453            1        < 0.1%
40.3226251             1        < 0.1%
40.32216902            1        < 0.1%
40.32196593            1        < 0.1%

latitude
Real number (ℝ)

Distinct count: 57517
Unique (%): 96.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: -5.706032659626431
Minimum: -11.64944018
Maximum: -2e-08
Zeros: 0
Zeros (%): 0.0%
Memory size: 464.2 KiB

Quantile statistics

Minimum: -11.64944018
5-th percentile: -10.58554992
Q1: -8.540621305
Median: -5.02159665
Q3: -3.32615564
95-th percentile: -1.408872227
Maximum: -2e-08
Range: 11.64944016
Interquartile range (IQR): 5.214465665

Descriptive statistics

Standard deviation: 2.946019081
Coefficient of variation (CV): -0.5162990219
Kurtosis: -1.057616666
Mean: -5.70603266
Median Absolute Deviation (MAD): 2.07002988
Skewness: -0.1520365709
Sum: -338938.34
Variance: 8.679028427
Histogram with fixed size bins (bins=10)

Value                  Count    Frequency (%)
-2e-08                 1812     3.1%
-6.98584173            2        < 0.1%
-3.79757861            2        < 0.1%
-6.98188419            2        < 0.1%
-7.10462503            2        < 0.1%
-7.05692253            2        < 0.1%
-7.17517443            2        < 0.1%
-6.99073094            2        < 0.1%
-6.9787555             2        < 0.1%
-6.99470401            2        < 0.1%
-2.49454559            2        < 0.1%
-6.9642576             2        < 0.1%
-2.50658954            2        < 0.1%
-6.99054864            2        < 0.1%
-2.48522658            2        < 0.1%
-2.4943533             2        < 0.1%
-6.96247516            2        < 0.1%
-6.98945622            2        < 0.1%
-6.95732845            2        < 0.1%
-6.95871592            2        < 0.1%
-6.99261144            2        < 0.1%
-6.99129411            2        < 0.1%
-7.17715478            2        < 0.1%
-2.50162744            2        < 0.1%
-1.793342              2        < 0.1%
Other values (57492)   57540    96.9%

Value                  Count    Frequency (%)
-11.64944018           1        < 0.1%
-11.64837759           1        < 0.1%
-11.58629656           1        < 0.1%
-11.56857679           1        < 0.1%
-11.56680457           1        < 0.1%
-11.56450865           1        < 0.1%
-11.56432357           1        < 0.1%
-11.56231592           1        < 0.1%
-11.56228898           1        < 0.1%
-11.56161898           1        < 0.1%

Value                  Count    Frequency (%)
-2e-08                 1812     3.1%
-0.99846435            1        < 0.1%
-0.998916              1        < 0.1%
-0.99901209            1        < 0.1%
-0.99911702            1        < 0.1%
-0.9994692             1        < 0.1%
-0.99950651            1        < 0.1%
-0.99952232            1        < 0.1%
-1.00058519            1        < 0.1%
-1.0015208             1        < 0.1%
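The value -2e-08 appears 1812 times in latitude, the same count as the zeros in longitude. That pattern suggests a placeholder for missing coordinates, although the report does not confirm that the two occur on the same rows. A minimal sketch under that assumption; the function name and the tolerance are hypothetical:

```python
def clean_coordinates(longitude, latitude, tol=1e-6):
    """Return (longitude, latitude), or (None, None) for a likely placeholder.

    Assumption: a point whose longitude and latitude are both within `tol`
    of zero is a missing-value placeholder rather than a real location.
    """
    if abs(longitude) < tol and abs(latitude) < tol:
        return None, None
    return longitude, latitude


print(clean_coordinates(0.0, -2e-08))               # likely placeholder
print(clean_coordinates(34.90874343, -5.02159665))  # the two column medians
```

Treating such rows as missing before computing statistics would remove the spike at zero that drives the strongly negative skewness reported for longitude.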
 

wpt_name
Categorical

HIGH CARDINALITY

Distinct count: 37400
Unique (%): 63.0%
Missing: 0
Missing (%): 0.0%
Memory size: 464.2 KiB
Top values: none (3563), Shuleni (1748), Zahanati (830), Msikitini (535), Kanisani (323); other values (37395): 52401
Value                  Count    Frequency (%)
none                   3563     6.0%
Shuleni                1748     2.9%
Zahanati               830      1.4%
Msikitini              535      0.9%
Kanisani               323      0.5%
Bombani                271      0.5%
Sokoni                 260      0.4%
Ofisini                254      0.4%
School                 208      0.4%
Shule Ya Msingi        199      0.3%
Shule                  152      0.3%
Sekondari              146      0.2%
Muungano               133      0.2%
Mkombozi               111      0.2%
Madukani               104      0.2%
Mbugani                94       0.2%
Hospital               94       0.2%
Upendo                 93       0.2%
Kituo Cha Afya         90       0.2%
Mkuyuni                88       0.1%
Umoja                  84       0.1%
Center                 83       0.1%
Ccm                    81       0.1%
Kisimani               78       0.1%
Mtakuja                76       0.1%
Other values (37375)   49702    83.7%

Length

Max length: 30
Median length: 10
Mean length: 10.96210438
Min length: 1

Overview of Unicode Properties

Unique unicode characters: 75
Unique unicode categories: 10
Unique unicode scripts: 2
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

ValueCountFrequency (%) 
a9880615.2%
 
i524048.0%
 
498987.7%
 
n421486.5%
 
e409856.3%
 
w316694.9%
 
K313854.8%
 
o302474.6%
 
u242173.7%
 
M220403.4%
 
l209543.2%
 
m176312.7%
 
h172152.6%
 
s167752.6%
 
r141432.2%
 
g130142.0%
 
t115731.8%
 
k110461.7%
 
S107521.7%
 
b104381.6%
 
d103891.6%
 
y77841.2%
 
z63001.0%
 
c50440.8%
 
N48800.7%
 
Other values (50)494127.6%
 

Most occurring categories

ValueCountFrequency (%) 
Lowercase Letter49342275.8%
 
Uppercase Letter10518516.2%
 
Space Separator498987.7%
 
Decimal Number16800.3%
 
Other Punctuation7410.1%
 
Dash Punctuation104< 0.1%
 
Open Punctuation37< 0.1%
 
Close Punctuation37< 0.1%
 
Connector Punctuation24< 0.1%
 
Modifier Symbol21< 0.1%
 

Most frequent Lowercase Letter characters

ValueCountFrequency (%) 
a9880620.0%
 
i5240410.6%
 
n421488.5%
 
e409858.3%
 
w316696.4%
 
o302476.1%
 
u242174.9%
 
l209544.2%
 
m176313.6%
 
h172153.5%
 
s167753.4%
 
r141432.9%
 
g130142.6%
 
t115732.3%
 
k110462.2%
 
b104382.1%
 
d103892.1%
 
y77841.6%
 
z63001.3%
 
c50441.0%
 
p35840.7%
 
j34940.7%
 
f23030.5%
 
v10580.2%
 
x127< 0.1%
 

Most frequent Uppercase Letter characters

ValueCountFrequency (%) 
K3138529.8%
 
M2204021.0%
 
S1075210.2%
 
N48804.6%
 
A34973.3%
 
B34253.3%
 
C27912.7%
 
P25642.4%
 
L25072.4%
 
J23852.3%
 
Y20051.9%
 
T19261.8%
 
I18511.8%
 
H16231.5%
 
R16201.5%
 
Z15261.5%
 
D14171.3%
 
G13181.3%
 
O12261.2%
 
E12091.1%
 
U10421.0%
 
W9100.9%
 
F8220.8%
 
V4040.4%
 
Q530.1%
 

Most frequent Space Separator characters

ValueCountFrequency (%) 
49898100.0%
 

Most frequent Other Punctuation characters

ValueCountFrequency (%) 
'41756.3%
 
.17523.6%
 
/14619.7%
 
&20.3%
 
\10.1%
 

Most frequent Dash Punctuation characters

ValueCountFrequency (%) 
-104100.0%
 

Most frequent Decimal Number characters

ValueCountFrequency (%) 
150730.2%
 
243926.1%
 
31529.0%
 
41207.1%
 
71066.3%
 
5865.1%
 
6804.8%
 
8754.5%
 
9704.2%
 
0452.7%
 

Most frequent Open Punctuation characters

ValueCountFrequency (%) 
(2978.4%
 
[821.6%
 

Most frequent Close Punctuation characters

ValueCountFrequency (%) 
)2978.4%
 
]821.6%
 

Most frequent Connector Punctuation characters

ValueCountFrequency (%) 
_24100.0%
 

Most frequent Modifier Symbol characters

ValueCountFrequency (%) 
`21100.0%
 

Most occurring scripts

ValueCountFrequency (%) 
Latin59860791.9%
 
Common525428.1%
 

Most frequent Latin characters

ValueCountFrequency (%) 
a9880616.5%
 
i524048.8%
 
n421487.0%
 
e409856.8%
 
w316695.3%
 
K313855.2%
 
o302475.1%
 
u242174.0%
 
M220403.7%
 
l209543.5%
 
m176312.9%
 
h172152.9%
 
s167752.8%
 
r141432.4%
 
g130142.2%
 
t115731.9%
 
k110461.8%
 
S107521.8%
 
b104381.7%
 
d103891.7%
 
y77841.3%
 
z63001.1%
 
c50440.8%
 
N48800.8%
 
p35840.6%
 
Other values (27)431847.2%
 

Most frequent Common characters

ValueCountFrequency (%) 
4989895.0%
 
15071.0%
 
24390.8%
 
'4170.8%
 
.1750.3%
 
31520.3%
 
/1460.3%
 
41200.2%
 
71060.2%
 
-1040.2%
 
5860.2%
 
6800.2%
 
8750.1%
 
9700.1%
 
0450.1%
 
(290.1%
 
)290.1%
 
_24< 0.1%
 
`21< 0.1%
 
[8< 0.1%
 
]8< 0.1%
 
&2< 0.1%
 
\1< 0.1%
 

Most occurring blocks

ValueCountFrequency (%) 
ASCII651149100.0%
 

Most frequent ASCII characters

ValueCountFrequency (%) 
a9880615.2%
 
i524048.0%
 
498987.7%
 
n421486.5%
 
e409856.3%
 
w316694.9%
 
K313854.8%
 
o302474.6%
 
u242173.7%
 
M220403.4%
 
l209543.2%
 
m176312.7%
 
h172152.6%
 
s167752.6%
 
r141432.2%
 
g130142.0%
 
t115731.8%
 
k110461.7%
 
S107521.7%
 
b104381.6%
 
d103891.6%
 
y77841.2%
 
z63001.0%
 
c50440.8%
 
N48800.7%
 
Other values (50)494127.6%
 

num_private
Real number (ℝ≥0)

SKEWED
ZEROS

Distinct count: 65
Unique (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.47414141414141414
Minimum: 0
Maximum: 1776
Zeros: 58643
Zeros (%): 98.7%
Memory size: 464.2 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 0
95-th percentile: 0
Maximum: 1776
Range: 1776
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 12.23622981
Coefficient of variation (CV): 25.80713147
Kurtosis: 11137.29521
Mean: 0.4741414141
Median Absolute Deviation (MAD): 0
Skewness: 91.93374999
Sum: 28164
Variance: 149.72532
Histogram with fixed size bins (bins=10)

Value                  Count    Frequency (%)
0                      58643    98.7%
6                      81       0.1%
1                      73       0.1%
5                      46       0.1%
8                      46       0.1%
32                     40       0.1%
45                     36       0.1%
15                     35       0.1%
39                     30       0.1%
93                     28       < 0.1%
3                      27       < 0.1%
7                      26       < 0.1%
2                      23       < 0.1%
65                     22       < 0.1%
47                     21       < 0.1%
102                    20       < 0.1%
4                      20       < 0.1%
17                     17       < 0.1%
80                     15       < 0.1%
20                     14       < 0.1%
25                     12       < 0.1%
11                     11       < 0.1%
41                     10       < 0.1%
34                     10       < 0.1%
16                     8        < 0.1%
Other values (40)      86       0.1%

Value                  Count    Frequency (%)
0                      58643    98.7%
1                      73       0.1%
2                      23       < 0.1%
3                      27       < 0.1%
4                      20       < 0.1%
5                      46       0.1%
6                      81       0.1%
7                      26       < 0.1%
8                      46       0.1%
9                      4        < 0.1%

Value                  Count    Frequency (%)
1776                   1        < 0.1%
1402                   1        < 0.1%
755                    1        < 0.1%
698                    1        < 0.1%
672                    1        < 0.1%
668                    1        < 0.1%
450                    1        < 0.1%
300                    1        < 0.1%
280                    1        < 0.1%
240                    1        < 0.1%

basin
Categorical

Distinct count: 9
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 464.2 KiB
Top values: Lake Victoria (10248), Pangani (8940), Rufiji (7976), Internal (7785), Lake Tanganyika (6432); other values (4): 18019
Value                     Count    Frequency (%)
Lake Victoria             10248    17.3%
Pangani                   8940     15.1%
Rufiji                    7976     13.4%
Internal                  7785     13.1%
Lake Tanganyika           6432     10.8%
Wami / Ruvu               5987     10.1%
Lake Nyasa                5085     8.6%
Ruvuma / Southern Coast   4493     7.6%
Lake Rukwa                2454     4.1%

Length

Max length: 23
Median length: 10
Mean length: 10.8923569
Min length: 6

Overview of Unicode Properties

Unique unicode characters: 32
Unique unicode categories: 4
Unique unicode scripts: 2
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

ValueCountFrequency (%) 
a10702516.5%
 
i578078.9%
 
n508077.9%
 
496727.7%
 
e364975.6%
 
u358835.5%
 
k331055.1%
 
t270194.2%
 
L242193.7%
 
r225263.5%
 
R209103.2%
 
o192343.0%
 
g153722.4%
 
y115171.8%
 
v104801.6%
 
m104801.6%
 
/104801.6%
 
V102481.6%
 
c102481.6%
 
s95781.5%
 
P89401.4%
 
f79761.2%
 
j79761.2%
 
I77851.2%
 
l77851.2%
 
Other values (7)334375.2%
 

Most occurring categories

ValueCountFrequency (%) 
Lowercase Letter48826275.5%
 
Uppercase Letter9859215.2%
 
Space Separator496727.7%
 
Other Punctuation104801.6%
 

Most frequent Uppercase Letter characters

ValueCountFrequency (%) 
L2421924.6%
 
R2091021.2%
 
V1024810.4%
 
P89409.1%
 
I77857.9%
 
T64326.5%
 
W59876.1%
 
N50855.2%
 
S44934.6%
 
C44934.6%
 

Most frequent Lowercase Letter characters

ValueCountFrequency (%) 
a10702521.9%
 
i5780711.8%
 
n5080710.4%
 
e364977.5%
 
u358837.3%
 
k331056.8%
 
t270195.5%
 
r225264.6%
 
o192343.9%
 
g153723.1%
 
y115172.4%
 
v104802.1%
 
m104802.1%
 
c102482.1%
 
s95782.0%
 
f79761.6%
 
j79761.6%
 
l77851.6%
 
h44930.9%
 
w24540.5%
 

Most frequent Space Separator characters

ValueCountFrequency (%) 
49672100.0%
 

Most frequent Other Punctuation characters

ValueCountFrequency (%) 
/10480100.0%
 

Most occurring scripts

ValueCountFrequency (%) 
Latin58685490.7%
 
Common601529.3%
 

Most frequent Latin characters

ValueCountFrequency (%) 
a10702518.2%
 
i578079.9%
 
n508078.7%
 
e364976.2%
 
u358836.1%
 
k331055.6%
 
t270194.6%
 
L242194.1%
 
r225263.8%
 
R209103.6%
 
o192343.3%
 
g153722.6%
 
y115172.0%
 
v104801.8%
 
m104801.8%
 
V102481.7%
 
c102481.7%
 
s95781.6%
 
P89401.5%
 
f79761.4%
 
j79761.4%
 
I77851.3%
 
l77851.3%
 
T64321.1%
 
W59871.0%
 
Other values (5)210183.6%
 

Most frequent Common characters

ValueCountFrequency (%) 
4967282.6%
 
/1048017.4%
 

Most occurring blocks

ValueCountFrequency (%) 
ASCII647006100.0%
 

Most frequent ASCII characters

ValueCountFrequency (%) 
a10702516.5%
 
i578078.9%
 
n508077.9%
 
496727.7%
 
e364975.6%
 
u358835.5%
 
k331055.1%
 
t270194.2%
 
L242193.7%
 
r225263.5%
 
R209103.2%
 
o192343.0%
 
g153722.4%
 
y115171.8%
 
v104801.6%
 
m104801.6%
 
/104801.6%
 
V102481.6%
 
c102481.6%
 
s95781.5%
 
P89401.4%
 
f79761.2%
 
j79761.2%
 
I77851.2%
 
l77851.2%
 
Other values (7)334375.2%
 

subvillage
Categorical

HIGH CARDINALITY

Distinct count: 19287
Unique (%): 32.7%
Missing: 371
Missing (%): 0.6%
Memory size: 464.2 KiB
Top values: Madukani (508), Shuleni (506), Majengo (502), Kati (373), Mtakuja (262); other values (19282): 56878
Value                  Count    Frequency (%)
Madukani               508      0.9%
Shuleni                506      0.9%
Majengo                502      0.8%
Kati                   373      0.6%
Mtakuja                262      0.4%
Sokoni                 232      0.4%
M                      187      0.3%
Muungano               172      0.3%
Mbuyuni                164      0.3%
Mlimani                152      0.3%
Songambele             147      0.2%
Miembeni               134      0.2%
Msikitini              134      0.2%
1                      132      0.2%
Kibaoni                114      0.2%
Kanisani               111      0.2%
Mapinduzi              109      0.2%
I                      109      0.2%
Mjimwema               108      0.2%
Mjini                  108      0.2%
Mkwajuni               104      0.2%
Mwenge                 102      0.2%
Azimio                 98       0.2%
Mabatini               98       0.2%
Mission                95       0.2%
Other values (19262)   54268    91.4%
(Missing)              371      0.6%

Length

Max length: 30
Median length: 7
Mean length: 7.867003367
Min length: 1

Overview of Unicode Properties

Unique unicode characters: 73
Unique unicode categories: 10
Unique unicode scripts: 2
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

ValueCountFrequency (%) 
a7237415.5%
 
i456669.8%
 
n342417.3%
 
u264245.7%
 
e256715.5%
 
o235565.0%
 
M204314.4%
 
g189514.1%
 
l163723.5%
 
m150533.2%
 
K125452.7%
 
b118432.5%
 
117662.5%
 
t117022.5%
 
k111162.4%
 
r100272.1%
 
w100032.1%
 
s99842.1%
 
h94302.0%
 
d82741.8%
 
y70551.5%
 
N60681.3%
 
B51121.1%
 
I45031.0%
 
j42850.9%
 
Other values (48)348487.5%
 

Most occurring categories

ValueCountFrequency (%) 
Lowercase Letter38237681.8%
 
Uppercase Letter7129115.3%
 
Space Separator117662.5%
 
Other Punctuation11840.3%
 
Decimal Number5890.1%
 
Modifier Symbol45< 0.1%
 
Dash Punctuation36< 0.1%
 
Open Punctuation5< 0.1%
 
Close Punctuation5< 0.1%
 
Connector Punctuation3< 0.1%
 

Most frequent Uppercase Letter characters

ValueCountFrequency (%) 
M2043128.7%
 
K1254517.6%
 
N60688.5%
 
B51127.2%
 
I45036.3%
 
S40395.7%
 
A30764.3%
 
C25333.6%
 
L24583.4%
 
U17042.4%
 
T11231.6%
 
W10691.5%
 
R9051.3%
 
O8951.3%
 
G8941.3%
 
J7331.0%
 
D6290.9%
 
P4900.7%
 
H4890.7%
 
E3710.5%
 
Z3660.5%
 
V3330.5%
 
Y2830.4%
 
F1750.2%
 
Q670.1%
 

Most frequent Lowercase Letter characters

ValueCountFrequency (%) 
a7237418.9%
 
i4566611.9%
 
n342419.0%
 
u264246.9%
 
e256716.7%
 
o235566.2%
 
g189515.0%
 
l163724.3%
 
m150533.9%
 
b118433.1%
 
t117023.1%
 
k111162.9%
 
r100272.6%
 
w100032.6%
 
s99842.6%
 
h94302.5%
 
d82742.2%
 
y70551.8%
 
j42851.1%
 
z37221.0%
 
p28250.7%
 
c15930.4%
 
f10980.3%
 
v10450.3%
 
q62< 0.1%
 

Most frequent Space Separator characters

ValueCountFrequency (%) 
11766100.0%
 

Most frequent Other Punctuation characters

ValueCountFrequency (%) 
'101785.9%
 
/13611.5%
 
.292.4%
 
#20.2%
 

Most frequent Decimal Number characters

ValueCountFrequency (%) 
124241.1%
 
27011.9%
 
3508.5%
 
4498.3%
 
6335.6%
 
8325.4%
 
9325.4%
 
0305.1%
 
5294.9%
 
7223.7%
 

Most frequent Modifier Symbol characters

ValueCountFrequency (%) 
`45100.0%
 

Most frequent Open Punctuation characters

ValueCountFrequency (%) 
(480.0%
 
[120.0%
 

Most frequent Close Punctuation characters

ValueCountFrequency (%) 
)480.0%
 
]120.0%
 

Most frequent Dash Punctuation characters

ValueCountFrequency (%) 
-36100.0%
 

Most frequent Connector Punctuation characters

ValueCountFrequency (%) 
_3100.0%
 

Most occurring scripts

ValueCountFrequency (%) 
Latin45366797.1%
 
Common136332.9%
 

Most frequent Latin characters

ValueCountFrequency (%) 
a7237416.0%
 
i4566610.1%
 
n342417.5%
 
u264245.8%
 
e256715.7%
 
o235565.2%
 
M204314.5%
 
g189514.2%
 
l163723.6%
 
m150533.3%
 
K125452.8%
 
b118432.6%
 
t117022.6%
 
k111162.5%
 
r100272.2%
 
w100032.2%
 
s99842.2%
 
h94302.1%
 
d82741.8%
 
y70551.6%
 
N60681.3%
 
B51121.1%
 
I45031.0%
 
j42850.9%
 
S40390.9%
 
Other values (26)289426.4%
 

Most frequent Common characters

ValueCountFrequency (%) 
1176686.3%
 
'10177.5%
 
12421.8%
 
/1361.0%
 
2700.5%
 
3500.4%
 
4490.4%
 
`450.3%
 
-360.3%
 
6330.2%
 
8320.2%
 
9320.2%
 
0300.2%
 
5290.2%
 
.290.2%
 
7220.2%
 
(4< 0.1%
 
)4< 0.1%
 
_3< 0.1%
 
#2< 0.1%
 
[1< 0.1%
 
]1< 0.1%
 

Most occurring blocks

ValueCountFrequency (%) 
ASCII467300100.0%
 

Most frequent ASCII characters

ValueCountFrequency (%) 
a7237415.5%
 
i456669.8%
 
n342417.3%
 
u264245.7%
 
e256715.5%
 
o235565.0%
 
M204314.4%
 
g189514.1%
 
l163723.5%
 
m150533.2%
 
K125452.7%
 
b118432.5%
 
117662.5%
 
t117022.5%
 
k111162.4%
 
r100272.1%
 
w100032.1%
 
s99842.1%
 
h94302.0%
 
d82741.8%
 
y70551.5%
 
N60681.3%
 
B51121.1%
 
I45031.0%
 
j42850.9%
 
Other values (48)348487.5%
 

region
Categorical

Distinct count: 21
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 464.2 KiB

Value            Count    Frequency (%)
Iringa            5294    8.9%
Shinyanga         4982    8.4%
Mbeya             4639    7.8%
Kilimanjaro       4379    7.4%
Morogoro          4006    6.7%
Arusha            3350    5.6%
Kagera            3316    5.6%
Mwanza            3102    5.2%
Kigoma            2816    4.7%
Ruvuma            2640    4.4%
Pwani             2635    4.4%
Tanga             2547    4.3%
Dodoma            2201    3.7%
Singida           2093    3.5%
Mara              1969    3.3%
Tabora            1959    3.3%
Rukwa             1808    3.0%
Mtwara            1730    2.9%
Manyara           1583    2.7%
Lindi             1546    2.6%
Dar es Salaam      805    1.4%

Length

Max length: 13
Median length: 6
Mean length: 6.623754209
Min length: 4

Overview of Unicode Properties

Unique unicode characters: 32
Unique unicode categories: 3
Unique unicode scripts: 2
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

ValueCountFrequency (%) 
a8341321.2%
 
n331438.4%
 
r323978.2%
 
i317638.1%
 
o295807.5%
 
g250546.4%
 
M170294.3%
 
m128413.3%
 
y112042.8%
 
K105112.7%
 
u104382.7%
 
w92752.4%
 
e87602.2%
 
h83322.1%
 
S78802.0%
 
b65981.7%
 
d58401.5%
 
I52941.3%
 
l51841.3%
 
T45061.1%
 
R44481.1%
 
j43791.1%
 
s41551.1%
 
A33500.9%
 
z31020.8%
 
Other values (7)149753.8%
 

Most occurring categories

ValueCountFrequency (%) 
Lowercase Letter33163684.3%
 
Uppercase Letter6020515.3%
 
Space Separator16100.4%
 

Most frequent Uppercase Letter characters

ValueCountFrequency (%) 
M1702928.3%
 
K1051117.5%
 
S788013.1%
 
I52948.8%
 
T45067.5%
 
R44487.4%
 
A33505.6%
 
D30065.0%
 
P26354.4%
 
L15462.6%
 

Most frequent Lowercase Letter characters

ValueCountFrequency (%) 
a8341325.2%
 
n3314310.0%
 
r323979.8%
 
i317639.6%
 
o295808.9%
 
g250547.6%
 
m128413.9%
 
y112043.4%
 
u104383.1%
 
w92752.8%
 
e87602.6%
 
h83322.5%
 
b65982.0%
 
d58401.8%
 
l51841.6%
 
j43791.3%
 
s41551.3%
 
z31020.9%
 
v26400.8%
 
k18080.5%
 
t17300.5%
 

Most frequent Space Separator characters

ValueCountFrequency (%) 
1610100.0%
 

Most occurring scripts

ValueCountFrequency (%) 
Latin39184199.6%
 
Common16100.4%
 

Most frequent Latin characters

ValueCountFrequency (%) 
a8341321.3%
 
n331438.5%
 
r323978.3%
 
i317638.1%
 
o295807.5%
 
g250546.4%
 
M170294.3%
 
m128413.3%
 
y112042.9%
 
K105112.7%
 
u104382.7%
 
w92752.4%
 
e87602.2%
 
h83322.1%
 
S78802.0%
 
b65981.7%
 
d58401.5%
 
I52941.4%
 
l51841.3%
 
T45061.1%
 
R44481.1%
 
j43791.1%
 
s41551.1%
 
A33500.9%
 
z31020.8%
 
Other values (6)133653.4%
 

Most frequent Common characters

ValueCountFrequency (%) 
1610100.0%
 

Most occurring blocks

ValueCountFrequency (%) 
ASCII393451100.0%
 

Most frequent ASCII characters

ValueCountFrequency (%) 
a8341321.2%
 
n331438.4%
 
r323978.2%
 
i317638.1%
 
o295807.5%
 
g250546.4%
 
M170294.3%
 
m128413.3%
 
y112042.8%
 
K105112.7%
 
u104382.7%
 
w92752.4%
 
e87602.2%
 
h83322.1%
 
S78802.0%
 
b65981.7%
 
d58401.5%
 
I52941.3%
 
l51841.3%
 
T45061.1%
 
R44481.1%
 
j43791.1%
 
s41551.1%
 
A33500.9%
 
z31020.8%
 
Other values (7)149753.8%
 

region_code
Real number (ℝ≥0)

Distinct count: 27
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 15.297003367003366
Minimum: 1
Maximum: 99
Zeros: 0
Zeros (%): 0.0%
Memory size: 464.2 KiB

Quantile statistics

Minimum: 1
5-th percentile: 2
Q1: 5
Median: 12
Q3: 17
95-th percentile: 60
Maximum: 99
Range: 98
Interquartile range (IQR): 12

Descriptive statistics

Standard deviation: 17.58740634
Coefficient of variation (CV): 1.149728866
Kurtosis: 10.28843341
Mean: 15.29700337
Median Absolute Deviation (MAD): 6
Skewness: 3.17381811
Sum: 908642
Variance: 309.3168617
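
The descriptive statistics for region_code are related by simple arithmetic, which the sketch below checks. The input figures are copied from this report (59400 observations, none missing); the variable names are chosen for the example only.

```python
# Figures copied from the region_code summary in this report.
n_observations = 59400
total = 908642
mean = 15.29700337
std_dev = 17.58740634

# The mean is the column sum divided by the number of observations.
mean_from_sum = total / n_observations

# Coefficient of variation: standard deviation relative to the mean.
cv = std_dev / mean

# The variance is the square of the standard deviation.
variance = std_dev ** 2

print(f"mean={mean_from_sum:.6f} cv={cv:.6f} variance={variance:.4f}")
```

A coefficient of variation above 1 means the spread is larger than the mean, which fits the strong right skew reported for this variable.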
Histogram with fixed size bins (bins=10)
Most common values

Value               Count    Frequency (%)
11                   5300    8.9%
17                   5011    8.4%
12                   4639    7.8%
3                    4379    7.4%
5                    4040    6.8%
18                   3324    5.6%
19                   3047    5.1%
2                    3024    5.1%
16                   2816    4.7%
10                   2640    4.4%
4                    2513    4.2%
1                    2201    3.7%
13                   2093    3.5%
14                   1979    3.3%
20                   1969    3.3%
15                   1808    3.0%
6                    1609    2.7%
21                   1583    2.7%
80                   1238    2.1%
60                   1025    1.7%
90                    917    1.5%
7                     805    1.4%
99                    423    0.7%
9                     390    0.7%
24                    326    0.5%
Other values (2)      301    0.5%

Smallest 10 values

Value    Count    Frequency (%)
1         2201    3.7%
2         3024    5.1%
3         4379    7.4%
4         2513    4.2%
5         4040    6.8%
6         1609    2.7%
7          805    1.4%
8          300    0.5%
9          390    0.7%
10        2640    4.4%

Largest 10 values

Value    Count    Frequency (%)
99         423    0.7%
90         917    1.5%
80        1238    2.1%
60        1025    1.7%
40           1    < 0.1%
24         326    0.5%
21        1583    2.7%
20        1969    3.3%
19        3047    5.1%
18        3324    5.6%

district_code
Real number (ℝ≥0)

Distinct count: 20
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 5.629747474747475
Minimum: 0
Maximum: 80
Zeros: 23
Zeros (%): < 0.1%
Memory size: 464.2 KiB

Quantile statistics

Minimum: 0
5-th percentile: 1
Q1: 2
Median: 3
Q3: 5
95-th percentile: 30
Maximum: 80
Range: 80
Interquartile range (IQR): 3

Descriptive statistics

Standard deviation: 9.633648629
Coefficient of variation (CV): 1.711204396
Kurtosis: 16.21428363
Mean: 5.629747475
Median Absolute Deviation (MAD): 1
Skewness: 3.962045299
Sum: 334407
Variance: 92.80718592
Histogram with fixed size bins (bins=10)
Most common values

Value    Count    Frequency (%)
1        12203    20.5%
2        11173    18.8%
3         9998    16.8%
4         8999    15.1%
5         4356    7.3%
6         4074    6.9%
7         3343    5.6%
8         1043    1.8%
30         995    1.7%
33         874    1.5%
53         745    1.3%
43         505    0.9%
13         391    0.7%
23         293    0.5%
63         195    0.3%
62         109    0.2%
60          63    0.1%
0           23    < 0.1%
80          12    < 0.1%
67           6    < 0.1%

Smallest 10 values

Value    Count    Frequency (%)
0           23    < 0.1%
1        12203    20.5%
2        11173    18.8%
3         9998    16.8%
4         8999    15.1%
5         4356    7.3%
6         4074    6.9%
7         3343    5.6%
8         1043    1.8%
13         391    0.7%

Largest 10 values

Value    Count    Frequency (%)
80          12    < 0.1%
67           6    < 0.1%
63         195    0.3%
62         109    0.2%
60          63    0.1%
53         745    1.3%
43         505    0.9%
33         874    1.5%
30         995    1.7%
23         293    0.5%

lga
Categorical

HIGH CARDINALITY

Distinct count: 125
Unique (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 464.2 KiB

Value                 Count    Frequency (%)
Njombe                 2503    4.2%
Arusha Rural           1252    2.1%
Moshi Rural            1251    2.1%
Bariadi                1177    2.0%
Rungwe                 1106    1.9%
Kilosa                 1094    1.8%
Kasulu                 1047    1.8%
Mbozi                  1034    1.7%
Meru                   1009    1.7%
Bagamoyo                997    1.7%
Singida Rural           995    1.7%
Kilombero               959    1.6%
Same                    877    1.5%
Kibondo                 874    1.5%
Kyela                   859    1.4%
Kahama                  836    1.4%
Kigoma Rural            824    1.4%
Magu                    824    1.4%
Maswa                   809    1.4%
Karagwe                 771    1.3%
Mbinga                  750    1.3%
Iringa Rural            728    1.2%
Serengeti               716    1.2%
Namtumbo                694    1.2%
Lushoto                 694    1.2%
Other values (100)    34720    58.5%

Length

Max length: 16
Median length: 6
Mean length: 7.416885522
Min length: 3

Overview of Unicode Properties

Unique unicode characters: 41
Unique unicode categories: 3
Unique unicode scripts: 2
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

ValueCountFrequency (%) 
a6998215.9%
 
o300796.8%
 
i294836.7%
 
u283246.4%
 
r268866.1%
 
e225795.1%
 
n225215.1%
 
l192384.4%
 
g183854.2%
 
M160173.6%
 
m156223.5%
 
b156033.5%
 
R122072.8%
 
K116632.6%
 
112352.6%
 
w98202.2%
 
s97472.2%
 
h84641.9%
 
d84101.9%
 
S62611.4%
 
N57601.3%
 
t56961.3%
 
B48391.1%
 
y47631.1%
 
k37210.8%
 
Other values (16)232585.3%
 

Most occurring categories

ValueCountFrequency (%) 
Lowercase Letter35869381.4%
 
Uppercase Letter7063516.0%
 
Space Separator112352.6%
 

Most frequent Uppercase Letter characters

ValueCountFrequency (%) 
M1601722.7%
 
R1220717.3%
 
K1166316.5%
 
S62618.9%
 
N57608.2%
 
B48396.9%
 
U34104.8%
 
I24803.5%
 
L21313.0%
 
T13671.9%
 
A13151.9%
 
H11531.6%
 
C8811.2%
 
G4880.7%
 
D3580.5%
 
P3050.4%
 

Most frequent Lowercase Letter characters

ValueCountFrequency (%) 
a6998219.5%
 
o300798.4%
 
i294838.2%
 
u283247.9%
 
r268867.5%
 
e225796.3%
 
n225216.3%
 
l192385.4%
 
g183855.1%
 
m156224.4%
 
b156034.3%
 
w98202.7%
 
s97472.7%
 
h84642.4%
 
d84102.3%
 
t56961.6%
 
y47631.3%
 
k37211.0%
 
j34961.0%
 
z19430.5%
 
p18540.5%
 
f11060.3%
 
v6710.2%
 
c3000.1%
 

Most frequent Space Separator characters

ValueCountFrequency (%) 
11235100.0%
 

Most occurring scripts

ValueCountFrequency (%) 
Latin42932897.4%
 
Common112352.6%
 

Most frequent Latin characters

ValueCountFrequency (%) 
a6998216.3%
 
o300797.0%
 
i294836.9%
 
u283246.6%
 
r268866.3%
 
e225795.3%
 
n225215.2%
 
l192384.5%
 
g183854.3%
 
M160173.7%
 
m156223.6%
 
b156033.6%
 
R122072.8%
 
K116632.7%
 
w98202.3%
 
s97472.3%
 
h84642.0%
 
d84102.0%
 
S62611.5%
 
N57601.3%
 
t56961.3%
 
B48391.1%
 
y47631.1%
 
k37210.9%
 
j34960.8%
 
Other values (15)197624.6%
 

Most frequent Common characters

ValueCountFrequency (%) 
11235100.0%
 

Most occurring blocks

ValueCountFrequency (%) 
ASCII440563100.0%
 

Most frequent ASCII characters

ValueCountFrequency (%) 
a6998215.9%
 
o300796.8%
 
i294836.7%
 
u283246.4%
 
r268866.1%
 
e225795.1%
 
n225215.1%
 
l192384.4%
 
g183854.2%
 
M160173.6%
 
m156223.5%
 
b156033.5%
 
R122072.8%
 
K116632.6%
 
112352.6%
 
w98202.2%
 
s97472.2%
 
h84641.9%
 
d84101.9%
 
S62611.4%
 
N57601.3%
 
t56961.3%
 
B48391.1%
 
y47631.1%
 
k37210.8%
 
Other values (16)232585.3%
 

ward
Categorical

HIGH CARDINALITY

Distinct count: 2092
Unique (%): 3.5%
Missing: 0
Missing (%): 0.0%
Memory size: 464.2 KiB

Value                  Count    Frequency (%)
Igosi                    307    0.5%
Imalinyi                 252    0.4%
Siha Kati                232    0.4%
Mdandu                   231    0.4%
Nduruma                  217    0.4%
Kitunda                  203    0.3%
Mishamo                  203    0.3%
Msindo                   201    0.3%
Chalinze                 196    0.3%
Maji ya Chai             190    0.3%
Usuka                    187    0.3%
Ngarenanyuki             172    0.3%
Chanika                  171    0.3%
Vikindu                  162    0.3%
Mtwango                  153    0.3%
Matola                   145    0.2%
Zinga/Ikerege            141    0.2%
Maramba                  139    0.2%
Wanging'ombe             139    0.2%
Itete                    137    0.2%
Magomeni                 135    0.2%
Kikatiti                 134    0.2%
Ifakara                  134    0.2%
Olkokola                 133    0.2%
Maposeni                 130    0.2%
Other values (2067)    54956    92.5%

Length

Max length: 23
Median length: 7
Mean length: 7.505841751
Min length: 3

Overview of Unicode Properties

Unique unicode characters: 54
Unique unicode categories: 5
Unique unicode scripts: 2
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

ValueCountFrequency (%) 
a6953315.6%
 
i402439.0%
 
n295846.6%
 
u270156.1%
 
o260935.9%
 
e235895.3%
 
g211664.7%
 
M189164.2%
 
m162163.6%
 
l157993.5%
 
r130572.9%
 
b128162.9%
 
s113352.5%
 
K112122.5%
 
h109752.5%
 
k108122.4%
 
t93112.1%
 
w91372.0%
 
d89602.0%
 
y71861.6%
 
I60941.4%
 
N59191.3%
 
54081.2%
 
z35770.8%
 
S33540.8%
 
Other values (29)285406.4%
 

Most occurring categories

ValueCountFrequency (%) 
Lowercase Letter37473084.0%
 
Uppercase Letter6452314.5%
 
Space Separator54081.2%
 
Other Punctuation11630.3%
 
Dash Punctuation23< 0.1%
 

Most frequent Uppercase Letter characters

ValueCountFrequency (%) 
M1891629.3%
 
K1121217.4%
 
I60949.4%
 
N59199.2%
 
S33545.2%
 
L31624.9%
 
B30984.8%
 
U29134.5%
 
C21233.3%
 
R16922.6%
 
T7761.2%
 
D7431.2%
 
O6611.0%
 
V6341.0%
 
P5770.9%
 
H5510.9%
 
W3870.6%
 
G3690.6%
 
Z3390.5%
 
E2890.4%
 
A2600.4%
 
J1870.3%
 
Y1490.2%
 
Q760.1%
 
F420.1%
 

Most frequent Lowercase Letter characters

ValueCountFrequency (%) 
a6953318.6%
 
i4024310.7%
 
n295847.9%
 
u270157.2%
 
o260937.0%
 
e235896.3%
 
g211665.6%
 
m162164.3%
 
l157994.2%
 
r130573.5%
 
b128163.4%
 
s113353.0%
 
h109752.9%
 
k108122.9%
 
t93112.5%
 
w91372.4%
 
d89602.4%
 
y71861.9%
 
z35771.0%
 
p28950.8%
 
j24460.7%
 
c13760.4%
 
f8160.2%
 
v7770.2%
 
q16< 0.1%
 

Most frequent Space Separator characters

ValueCountFrequency (%) 
5408100.0%
 

Most frequent Other Punctuation characters

ValueCountFrequency (%) 
'101387.1%
 
/15012.9%
 

Most frequent Dash Punctuation characters

ValueCountFrequency (%) 
-23100.0%
 

Most occurring scripts

ValueCountFrequency (%) 
Latin43925398.5%
 
Common65941.5%
 

Most frequent Latin characters

ValueCountFrequency (%) 
a6953315.8%
 
i402439.2%
 
n295846.7%
 
u270156.2%
 
o260935.9%
 
e235895.4%
 
g211664.8%
 
M189164.3%
 
m162163.7%
 
l157993.6%
 
r130573.0%
 
b128162.9%
 
s113352.6%
 
K112122.6%
 
h109752.5%
 
k108122.5%
 
t93112.1%
 
w91372.1%
 
d89602.0%
 
y71861.6%
 
I60941.4%
 
N59191.3%
 
z35770.8%
 
S33540.8%
 
L31620.7%
 
Other values (25)241925.5%
 

Most frequent Common characters

ValueCountFrequency (%) 
540882.0%
 
'101315.4%
 
/1502.3%
 
-230.3%
 

Most occurring blocks

ValueCountFrequency (%) 
ASCII445847100.0%
 

Most frequent ASCII characters

ValueCountFrequency (%) 
a6953315.6%
 
i402439.0%
 
n295846.6%
 
u270156.1%
 
o260935.9%
 
e235895.3%
 
g211664.7%
 
M189164.2%
 
m162163.6%
 
l157993.5%
 
r130572.9%
 
b128162.9%
 
s113352.5%
 
K112122.5%
 
h109752.5%
 
k108122.4%
 
t93112.1%
 
w91372.0%
 
d89602.0%
 
y71861.6%
 
I60941.4%
 
N59191.3%
 
54081.2%
 
z35770.8%
 
S33540.8%
 
Other values (29)285406.4%
 

population
Real number (ℝ≥0)

ZEROS

Distinct count: 1049
Unique (%): 1.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 179.90998316498317
Minimum: 0
Maximum: 30500
Zeros: 21381
Zeros (%): 36.0%
Memory size: 464.2 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 25
Q3: 215
95-th percentile: 680
Maximum: 30500
Range: 30500
Interquartile range (IQR): 215

Descriptive statistics

Standard deviation: 471.4821757
Coefficient of variation (CV): 2.620655994
Kurtosis: 402.2801153
Mean: 179.9099832
Median Absolute Deviation (MAD): 25
Skewness: 12.66071359
Sum: 10686653
Variance: 222295.442
Histogram with fixed size bins (bins=10)
Most common values

Value                  Count    Frequency (%)
0                      21381    36.0%
1                       7025    11.8%
200                     1940    3.3%
150                     1892    3.2%
250                     1681    2.8%
300                     1476    2.5%
100                     1146    1.9%
50                      1139    1.9%
500                     1009    1.7%
350                      986    1.7%
120                      916    1.5%
400                      775    1.3%
60                       706    1.2%
30                       626    1.1%
40                       552    0.9%
80                       533    0.9%
450                      499    0.8%
20                       462    0.8%
600                      438    0.7%
230                      388    0.7%
75                       289    0.5%
1000                     278    0.5%
800                      269    0.5%
90                       265    0.4%
130                      264    0.4%
Other values (1024)    12465    21.0%

Smallest 10 values

Value    Count    Frequency (%)
0        21381    36.0%
1         7025    11.8%
2            4    < 0.1%
3            4    < 0.1%
4           13    < 0.1%
5           44    0.1%
6           19    < 0.1%
7            3    < 0.1%
8           23    < 0.1%
9           11    < 0.1%

Largest 10 values

Value    Count    Frequency (%)
30500        1    < 0.1%
15300        1    < 0.1%
11463        1    < 0.1%
10000        3    < 0.1%
9865         1    < 0.1%
9500         1    < 0.1%
9000         3    < 0.1%
8848         1    < 0.1%
8600         1    < 0.1%
8500         1    < 0.1%

public_meeting
Boolean

MISSING

Distinct count: 2
Unique (%): < 0.1%
Missing: 3334
Missing (%): 5.6%
Memory size: 464.2 KiB

Value        Count    Frequency (%)
True         51011    85.9%
False         5055    8.5%
(Missing)     3334    5.6%

recorded_by
Categorical

CONSTANT
REJECTED

Distinct count: 1
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 464.2 KiB

Value                      Count    Frequency (%)
GeoData Consultants Ltd    59400    100.0%

Length

Max length: 23
Median length: 23
Mean length: 23
Min length: 23

Overview of Unicode Properties

Unique unicode characters: 14
Unique unicode categories: 3
Unique unicode scripts: 2
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

ValueCountFrequency (%) 
t23760017.4%
 
a17820013.0%
 
o1188008.7%
 
1188008.7%
 
n1188008.7%
 
s1188008.7%
 
G594004.3%
 
e594004.3%
 
D594004.3%
 
C594004.3%
 
u594004.3%
 
l594004.3%
 
L594004.3%
 
d594004.3%
 

Most occurring categories

ValueCountFrequency (%) 
Lowercase Letter100980073.9%
 
Uppercase Letter23760017.4%
 
Space Separator1188008.7%
 

Most frequent Uppercase Letter characters

ValueCountFrequency (%) 
G5940025.0%
 
D5940025.0%
 
C5940025.0%
 
L5940025.0%
 

Most frequent Lowercase Letter characters

ValueCountFrequency (%) 
t23760023.5%
 
a17820017.6%
 
o11880011.8%
 
n11880011.8%
 
s11880011.8%
 
e594005.9%
 
u594005.9%
 
l594005.9%
 
d594005.9%
 

Most frequent Space Separator characters

ValueCountFrequency (%) 
118800100.0%
 

Most occurring scripts

ValueCountFrequency (%) 
Latin124740091.3%
 
Common1188008.7%
 

Most frequent Latin characters

ValueCountFrequency (%) 
t23760019.0%
 
a17820014.3%
 
o1188009.5%
 
n1188009.5%
 
s1188009.5%
 
G594004.8%
 
e594004.8%
 
D594004.8%
 
C594004.8%
 
u594004.8%
 
l594004.8%
 
L594004.8%
 
d594004.8%
 

Most frequent Common characters

ValueCountFrequency (%) 
118800100.0%
 

Most occurring blocks

ValueCountFrequency (%) 
ASCII1366200100.0%
 

Most frequent ASCII characters

ValueCountFrequency (%) 
t23760017.4%
 
a17820013.0%
 
o1188008.7%
 
1188008.7%
 
n1188008.7%
 
s1188008.7%
 
G594004.3%
 
e594004.3%
 
D594004.3%
 
C594004.3%
 
u594004.3%
 
l594004.3%
 
L594004.3%
 
d594004.3%
 

scheme_management
Categorical

MISSING

Distinct count: 12
Unique (%): < 0.1%
Missing: 3877
Missing (%): 6.5%
Memory size: 464.2 KiB

Value               Count    Frequency (%)
VWC                 36793    61.9%
WUG                  5206    8.8%
Water authority      3153    5.3%
WUA                  2883    4.9%
Water Board          2748    4.6%
Parastatal           1680    2.8%
Private operator     1063    1.8%
Company              1061    1.8%
Other                 766    1.3%
SWC                    97    0.2%
Trust                  72    0.1%
None                    1    < 0.1%
(Missing)            3877    6.5%

Length

Max length: 16
Median length: 3
Mean length: 4.537373737
Min length: 3

Overview of Unicode Properties

Unique unicode characters: 29
Unique unicode categories: 3
Unique unicode scripts: 2
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

ValueCountFrequency (%) 
W5088018.9%
 
C3795114.1%
 
V3679313.7%
 
a255869.5%
 
t185316.9%
 
r175096.5%
 
o90893.4%
 
n88163.3%
 
e87943.3%
 
U80893.0%
 
69642.6%
 
G52061.9%
 
i42161.6%
 
y42141.6%
 
h39191.5%
 
u32251.2%
 
A28831.1%
 
B27481.0%
 
d27481.0%
 
P27431.0%
 
p21240.8%
 
s17520.7%
 
l16800.6%
 
v10630.4%
 
m10610.4%
 
Other values (4)9360.3%
 

Most occurring categories

ValueCountFrequency (%) 
Uppercase Letter14822955.0%
 
Lowercase Letter11432742.4%
 
Space Separator69642.6%
 

Most frequent Uppercase Letter characters

ValueCountFrequency (%) 
W5088034.3%
 
C3795125.6%
 
V3679324.8%
 
U80895.5%
 
G52063.5%
 
A28831.9%
 
B27481.9%
 
P27431.9%
 
O7660.5%
 
S970.1%
 
T72< 0.1%
 
N1< 0.1%
 

Most frequent Lowercase Letter characters

ValueCountFrequency (%) 
a2558622.4%
 
t1853116.2%
 
r1750915.3%
 
o90898.0%
 
n88167.7%
 
e87947.7%
 
i42163.7%
 
y42143.7%
 
h39193.4%
 
u32252.8%
 
d27482.4%
 
p21241.9%
 
s17521.5%
 
l16801.5%
 
v10630.9%
 
m10610.9%
 

Most frequent Space Separator characters

ValueCountFrequency (%) 
6964100.0%
 

Most occurring scripts

ValueCountFrequency (%) 
Latin26255697.4%
 
Common69642.6%
 

Most frequent Latin characters

ValueCountFrequency (%) 
W5088019.4%
 
C3795114.5%
 
V3679314.0%
 
a255869.7%
 
t185317.1%
 
r175096.7%
 
o90893.5%
 
n88163.4%
 
e87943.3%
 
U80893.1%
 
G52062.0%
 
i42161.6%
 
y42141.6%
 
h39191.5%
 
u32251.2%
 
A28831.1%
 
B27481.0%
 
d27481.0%
 
P27431.0%
 
p21240.8%
 
s17520.7%
 
l16800.6%
 
v10630.4%
 
m10610.4%
 
O7660.3%
 
Other values (3)1700.1%
 

Most frequent Common characters

ValueCountFrequency (%) 
6964100.0%
 

Most occurring blocks

ValueCountFrequency (%) 
ASCII269520100.0%
 

Most frequent ASCII characters

ValueCountFrequency (%) 
W5088018.9%
 
C3795114.1%
 
V3679313.7%
 
a255869.5%
 
t185316.9%
 
r175096.5%
 
o90893.4%
 
n88163.3%
 
e87943.3%
 
U80893.0%
 
69642.6%
 
G52061.9%
 
i42161.6%
 
y42141.6%
 
h39191.5%
 
u32251.2%
 
A28831.1%
 
B27481.0%
 
d27481.0%
 
P27431.0%
 
p21240.8%
 
s17520.7%
 
l16800.6%
 
v10630.4%
 
m10610.4%
 
Other values (4)9360.3%
 

scheme_name
Categorical

HIGH CARDINALITY
MISSING

Distinct count: 2696
Unique (%): 8.6%
Missing: 28166
Missing (%): 47.4%
Memory size: 464.2 KiB

Value                                    Count    Frequency (%)
K                                          682    1.1%
None                                       644    1.1%
Borehole                                   546    0.9%
Chalinze wate                              405    0.7%
M                                          400    0.7%
DANIDA                                     379    0.6%
Government                                 320    0.5%
Ngana water supplied scheme                270    0.5%
wanging'ombe water supply s                261    0.4%
wanging'ombe supply scheme                 234    0.4%
I                                          229    0.4%
Bagamoyo wate                              229    0.4%
Uroki-Bomang'ombe water sup                209    0.4%
N                                          204    0.3%
Kirua kahe gravity water supply trust      193    0.3%
Machumba estate pipe line                  185    0.3%
Makwale water supplied sche                166    0.3%
Kijiji                                     161    0.3%
S                                          154    0.3%
Handeni Trunk Main(H                       152    0.3%
mtwango water supply scheme                152    0.3%
Losaa-Kia water supply                     152    0.3%
Mkongoro Two                               147    0.2%
Roman                                      139    0.2%
Mkongoro One                               128    0.2%
Other values (2671)                      24493    41.2%
(Missing)                                28166    47.4%

Length

Max length: 46
Median length: 3
Mean length: 8.94456229
Min length: 1

Overview of Unicode Properties

Unique unicode characters: 68
Unique unicode categories: 9
Unique unicode scripts: 2
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

ValueCountFrequency (%) 
a7675014.4%
 
n7409213.9%
 
412527.8%
 
e352396.6%
 
i264115.0%
 
p224514.2%
 
r218164.1%
 
t192163.6%
 
u184413.5%
 
o174183.3%
 
l173083.3%
 
s164303.1%
 
w163613.1%
 
m141472.7%
 
y121562.3%
 
g113402.1%
 
M93141.8%
 
h80461.5%
 
K56001.1%
 
d55381.0%
 
k53881.0%
 
b51351.0%
 
c49780.9%
 
N44390.8%
 
S37700.7%
 
Other values (43)382717.2%
 

Most occurring categories

ValueCountFrequency (%) 
Lowercase Letter43768182.4%
 
Uppercase Letter500649.4%
 
Space Separator412527.8%
 
Other Punctuation13170.2%
 
Dash Punctuation5540.1%
 
Open Punctuation191< 0.1%
 
Decimal Number147< 0.1%
 
Modifier Symbol70< 0.1%
 
Close Punctuation31< 0.1%
 

Most frequent Uppercase Letter characters

ValueCountFrequency (%) 
M931418.6%
 
K560011.2%
 
N44398.9%
 
S37707.5%
 
A27295.5%
 
I26915.4%
 
W25315.1%
 
B23874.8%
 
L21074.2%
 
U17903.6%
 
D15763.1%
 
T15503.1%
 
C15273.1%
 
R14072.8%
 
E13362.7%
 
P10472.1%
 
H10232.0%
 
O9551.9%
 
G8991.8%
 
J3850.8%
 
V3690.7%
 
Y2680.5%
 
F2240.4%
 
Z980.2%
 
Q420.1%
 

Most frequent Lowercase Letter characters

ValueCountFrequency (%) 
a7675017.5%
 
n7409216.9%
 
e352398.1%
 
i264116.0%
 
p224515.1%
 
r218165.0%
 
t192164.4%
 
u184414.2%
 
o174184.0%
 
l173084.0%
 
s164303.8%
 
w163613.7%
 
m141473.2%
 
y121562.8%
 
g113402.6%
 
h80461.8%
 
d55381.3%
 
k53881.2%
 
b51351.2%
 
c49781.1%
 
v32550.7%
 
j30620.7%
 
z17080.4%
 
f9550.2%
 
q36< 0.1%
 

Most frequent Space Separator characters

ValueCountFrequency (%) 
41252100.0%
 

Most frequent Other Punctuation characters

ValueCountFrequency (%) 
'93871.2%
 
/37028.1%
 
&80.6%
 
:10.1%
 

Most frequent Dash Punctuation characters

ValueCountFrequency (%) 
-554100.0%
 

Most frequent Decimal Number characters

ValueCountFrequency (%) 
26141.5%
 
35537.4%
 
774.8%
 
174.8%
 
474.8%
 
542.7%
 
032.0%
 
632.0%
 

Most frequent Open Punctuation characters

ValueCountFrequency (%) 
(191100.0%
 

Most frequent Close Punctuation characters

ValueCountFrequency (%) 
)31100.0%
 

Most frequent Modifier Symbol characters

ValueCountFrequency (%) 
`70100.0%
 

Most occurring scripts

ValueCountFrequency (%) 
Latin48774591.8%
 
Common435628.2%
 

Most frequent Latin characters

ValueCountFrequency (%) 
a7675015.7%
 
n7409215.2%
 
e352397.2%
 
i264115.4%
 
p224514.6%
 
r218164.5%
 
t192163.9%
 
u184413.8%
 
o174183.6%
 
l173083.5%
 
s164303.4%
 
w163613.4%
 
m141472.9%
 
y121562.5%
 
g113402.3%
 
M93141.9%
 
h80461.6%
 
K56001.1%
 
d55381.1%
 
k53881.1%
 
b51351.1%
 
c49781.0%
 
N44390.9%
 
S37700.8%
 
v32550.7%
 
Other values (26)327066.7%
 

Most frequent Common characters

ValueCountFrequency (%) 
4125294.7%
 
'9382.2%
 
-5541.3%
 
/3700.8%
 
(1910.4%
 
`700.2%
 
2610.1%
 
3550.1%
 
)310.1%
 
&8< 0.1%
 
77< 0.1%
 
17< 0.1%
 
47< 0.1%
 
54< 0.1%
 
03< 0.1%
 
63< 0.1%
 
:1< 0.1%
 

Most occurring blocks

ValueCountFrequency (%) 
ASCII531307100.0%
 

Most frequent ASCII characters

ValueCountFrequency (%) 
a7675014.4%
 
n7409213.9%
 
412527.8%
 
e352396.6%
 
i264115.0%
 
p224514.2%
 
r218164.1%
 
t192163.6%
 
u184413.5%
 
o174183.3%
 
l173083.3%
 
s164303.1%
 
w163613.1%
 
m141472.7%
 
y121562.3%
 
g113402.1%
 
M93141.8%
 
h80461.5%
 
K56001.1%
 
d55381.0%
 
k53881.0%
 
b51351.0%
 
c49780.9%
 
N44390.8%
 
S37700.7%
 
Other values (43)382717.2%
 

permit
Boolean

MISSING

Distinct count: 2
Unique (%): < 0.1%
Missing: 3056
Missing (%): 5.1%
Memory size: 464.2 KiB

Value        Count    Frequency (%)
True         38852    65.4%
False        17492    29.4%
(Missing)     3056    5.1%

construction_year
Real number (ℝ≥0)

ZEROS

Distinct count: 55
Unique (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1300.6524747474748
Minimum: 0
Maximum: 2013
Zeros: 20709
Zeros (%): 34.9%
Memory size: 464.2 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 1986
Q3: 2004
95-th percentile: 2010
Maximum: 2013
Range: 2013
Interquartile range (IQR): 2004

Descriptive statistics

Standard deviation: 951.6205473
Coefficient of variation (CV): 0.7316485885
Kurtosis: -1.596432369
Mean: 1300.652475
Median Absolute Deviation (MAD): 22
Skewness: -0.6349277866
Sum: 77258757
Variance: 905581.6661
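
A mean construction year of about 1300 is pulled down by the 20709 zero entries. The sketch below shows how much the mean shifts if those zeros are excluded. Treating 0 as a placeholder for an unknown year is an assumption made for this example; the report itself only flags the column as ZEROS. The input figures are copied from this report, and the variable names are illustrative.

```python
# Figures copied from the construction_year summary in this report.
n_observations = 59400
n_zeros = 20709
total = 77258757

# Mean over all rows, zeros included (matches the reported mean).
mean_all = total / n_observations

# Zeros add nothing to the sum, so the mean of the non-zero rows is the
# same sum divided by the number of non-zero rows.
# ASSUMPTION: 0 stands for "year unknown" rather than a real year.
n_nonzero = n_observations - n_zeros
mean_nonzero = total / n_nonzero

print(f"all rows: {mean_all:.2f}, non-zero rows only: {mean_nonzero:.2f}")
```

Under that assumption the typical recorded year sits in the late 1990s, which is in line with the reported median of 1986 and Q3 of 2004 once the zero rows are set aside.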
Histogram with fixed size bins (bins=10)
Most common values

Value                Count    Frequency (%)
0                    20709    34.9%
2010                  2645    4.5%
2008                  2613    4.4%
2009                  2533    4.3%
2000                  2091    3.5%
2007                  1587    2.7%
2006                  1471    2.5%
2003                  1286    2.2%
2011                  1256    2.1%
2004                  1123    1.9%
2012                  1084    1.8%
2002                  1075    1.8%
1978                  1037    1.7%
1995                  1014    1.7%
2005                  1011    1.7%
1999                   979    1.6%
1998                   966    1.6%
1990                   954    1.6%
1985                   945    1.6%
1980                   811    1.4%
1996                   811    1.4%
1984                   779    1.3%
1982                   744    1.3%
1994                   738    1.2%
1972                   708    1.2%
Other values (30)     8430    14.2%

Smallest 10 values

Value    Count    Frequency (%)
0        20709    34.9%
1960       102    0.2%
1961        21    < 0.1%
1962        30    0.1%
1963        85    0.1%
1964        40    0.1%
1965        19    < 0.1%
1966        17    < 0.1%
1967        88    0.1%
1968        77    0.1%

Largest 10 values

Value    Count    Frequency (%)
2013       176    0.3%
2012      1084    1.8%
2011      1256    2.1%
2010      2645    4.5%
2009      2533    4.3%
2008      2613    4.4%
2007      1587    2.7%
2006      1471    2.5%
2005      1011    1.7%
2004      1123    1.9%

extraction_type
Categorical

HIGH CORRELATION

Distinct count: 18
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 464.2 KiB

Value                        Count    Frequency (%)
gravity                      26780    45.1%
nira/tanira                   8154    13.7%
other                         6430    10.8%
submersible                   4764    8.0%
swn 80                        3670    6.2%
mono                          2865    4.8%
india mark ii                 2400    4.0%
afridev                       1770    3.0%
ksb                           1415    2.4%
other - rope pump              451    0.8%
other - swn 81                 229    0.4%
windmill                       117    0.2%
india mark iii                  98    0.2%
cemo                            90    0.2%
other - play pump               85    0.1%
walimi                          48    0.1%
climax                          32    0.1%
other - mkulima/shinyanga        2    < 0.1%

Length

Max length: 25
Median length: 7
Mean length: 7.719511785
Min length: 3

Overview of Unicode Properties

Unique unicode characters: 29
Unique unicode categories: 5
Unique unicode scripts: 2
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

ValueCountFrequency (%) 
i6007813.1%
 
r5976813.0%
 
a5817912.7%
 
t421319.2%
 
v285506.2%
 
y268675.9%
 
g267825.8%
 
n256915.6%
 
e190364.2%
 
s148443.2%
 
o134682.9%
 
109652.4%
 
m109542.4%
 
b109432.4%
 
/81561.8%
 
h71991.6%
 
u53021.2%
 
l51651.1%
 
d43851.0%
 
w40640.9%
 
k39150.9%
 
838990.9%
 
036700.8%
 
f17700.4%
 
p16080.4%
 
Other values (4)11500.3%
 

Most occurring categories

ValueCountFrequency (%) 
Lowercase Letter43085394.0%
 
Space Separator109652.4%
 
Other Punctuation81561.8%
 
Decimal Number77981.7%
 
Dash Punctuation7670.2%
 

Most frequent Lowercase Letter characters

ValueCountFrequency (%) 
i6007813.9%
 
r5976813.9%
 
a5817913.5%
 
t421319.8%
 
v285506.6%
 
y268676.2%
 
g267826.2%
 
n256916.0%
 
e190364.4%
 
s148443.4%
 
o134683.1%
 
m109542.5%
 
b109432.5%
 
h71991.7%
 
u53021.2%
 
l51651.2%
 
d43851.0%
 
w40640.9%
 
k39150.9%
 
f17700.4%
 
p16080.4%
 
c122< 0.1%
 
x32< 0.1%
 

Most frequent Space Separator characters

ValueCountFrequency (%) 
10965100.0%
 

Most frequent Decimal Number characters

ValueCountFrequency (%) 
8389950.0%
 
0367047.1%
 
12292.9%
 

Most frequent Other Punctuation characters

ValueCountFrequency (%) 
/8156100.0%
 

Most frequent Dash Punctuation characters

ValueCountFrequency (%) 
-767100.0%
 

Most occurring scripts

ValueCountFrequency (%) 
Latin43085394.0%
 
Common276866.0%
 

Most frequent Latin characters

ValueCountFrequency (%) 
i6007813.9%
 
r5976813.9%
 
a5817913.5%
 
t421319.8%
 
v285506.6%
 
y268676.2%
 
g267826.2%
 
n256916.0%
 
e190364.4%
 
s148443.4%
 
o134683.1%
 
m109542.5%
 
b109432.5%
 
h71991.7%
 
u53021.2%
 
l51651.2%
 
d43851.0%
 
w40640.9%
 
k39150.9%
 
f17700.4%
 
p16080.4%
 
c122< 0.1%
 
x32< 0.1%
 

Most frequent Common characters

ValueCountFrequency (%) 
1096539.6%
 
/815629.5%
 
8389914.1%
 
0367013.3%
 
-7672.8%
 
12290.8%
 

Most occurring blocks

ValueCountFrequency (%) 
ASCII458539100.0%
 

Most frequent ASCII characters

ValueCountFrequency (%) 
i6007813.1%
 
r5976813.0%
 
a5817912.7%
 
t421319.2%
 
v285506.2%
 
y268675.9%
 
g267825.8%
 
n256915.6%
 
e190364.2%
 
s148443.2%
 
o134682.9%
 
109652.4%
 
m109542.4%
 
b109432.4%
 
/81561.8%
 
h71991.6%
 
u53021.2%
 
l51651.1%
 
d43851.0%
 
w40640.9%
 
k39150.9%
 
838990.9%
 
036700.8%
 
f17700.4%
 
p16080.4%
 
Other values (4)11500.3%
 

extraction_type_group
Categorical

HIGH CORRELATION

Distinct count: 13
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 464.2 KiB

Value             Count   Frequency (%)
gravity           26780   45.1%
nira/tanira        8154   13.7%
other              6430   10.8%
submersible        6179   10.4%
swn 80             3670   6.2%
mono               2865   4.8%
india mark ii      2400   4.0%
afridev            1770   3.0%
rope pump           451   0.8%
other handpump      364   0.6%
other motorpump     122   0.2%
wind-powered        117   0.2%
india mark iii       98   0.2%
 

Length

Max length: 15
Median length: 7
Mean length: 7.880538721
Min length: 4

Overview of Unicode Properties

Unique unicode characters: 26
Unique unicode categories: 5
Unique unicode scripts: 2
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
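
As a rough illustration of the category counts reported in these sections, the sketch below tallies Unicode general categories with Python's standard `unicodedata` module. The helper name `category_counts` and the three-value sample are assumptions made for the example; how the report itself computes these figures is not shown in this document. The two-letter codes map to the names used in the report: `Ll` is Lowercase Letter, `Zs` is Space Separator, `Po` is Other Punctuation, `Nd` is Decimal Number, and `Pd` is Dash Punctuation.

```python
import unicodedata
from collections import Counter


def category_counts(values):
    """Count Unicode general categories over every character in an iterable of strings."""
    counts = Counter()
    for value in values:
        for ch in value:
            counts[unicodedata.category(ch)] += 1
    return counts


# Toy sample: three of the category values listed for this variable,
# one occurrence each (not weighted by their real frequencies).
sample = ["gravity", "nira/tanira", "swn 80"]
counts = category_counts(sample)
print(counts)
```

The standard library exposes the general category but not the script or block of a code point, so the script and block tables would need a different data source.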

Most occurring characters

ValueCountFrequency (%) 
i6124413.1%
 
r6114113.1%
 
a5837212.5%
 
t419729.0%
 
v285506.1%
 
g267805.7%
 
y267805.7%
 
n258225.5%
 
e217294.6%
 
s160283.4%
 
o134582.9%
 
m126012.7%
 
b123582.6%
 
96032.1%
 
/81541.7%
 
h72801.6%
 
u71161.5%
 
l61791.3%
 
d48661.0%
 
w39040.8%
 
836700.8%
 
036700.8%
 
k24980.5%
 
p24420.5%
 
f17700.4%
 

Most occurring categories

ValueCountFrequency (%) 
Lowercase Letter44289094.6%
 
Space Separator96032.1%
 
Other Punctuation81541.7%
 
Decimal Number73401.6%
 
Dash Punctuation117< 0.1%
 

Most frequent Lowercase Letter characters

ValueCountFrequency (%) 
i6124413.8%
 
r6114113.8%
 
a5837213.2%
 
t419729.5%
 
v285506.4%
 
g267806.0%
 
y267806.0%
 
n258225.8%
 
e217294.9%
 
s160283.6%
 
o134583.0%
 
m126012.8%
 
b123582.8%
 
h72801.6%
 
u71161.6%
 
l61791.4%
 
d48661.1%
 
w39040.9%
 
k24980.6%
 
p24420.6%
 
f17700.4%
 

Most frequent Space Separator characters

ValueCountFrequency (%) 
9603100.0%
 

Most frequent Decimal Number characters

ValueCountFrequency (%) 
8367050.0%
 
0367050.0%
 

Most frequent Other Punctuation characters

ValueCountFrequency (%) 
/8154100.0%
 

Most frequent Dash Punctuation characters

ValueCountFrequency (%) 
-117100.0%
 

Most occurring scripts

ValueCountFrequency (%) 
Latin44289094.6%
 
Common252145.4%
 

Most frequent Latin characters

ValueCountFrequency (%) 
i6124413.8%
 
r6114113.8%
 
a5837213.2%
 
t419729.5%
 
v285506.4%
 
g267806.0%
 
y267806.0%
 
n258225.8%
 
e217294.9%
 
s160283.6%
 
o134583.0%
 
m126012.8%
 
b123582.8%
 
h72801.6%
 
u71161.6%
 
l61791.4%
 
d48661.1%
 
w39040.9%
 
k24980.6%
 
p24420.6%
 
f17700.4%
 

Most frequent Common characters

ValueCountFrequency (%) 
960338.1%
 
/815432.3%
 
8367014.6%
 
0367014.6%
 
-1170.5%
 

Most occurring blocks

ValueCountFrequency (%) 
ASCII468104100.0%
 

Most frequent ASCII characters

ValueCountFrequency (%) 
i6124413.1%
 
r6114113.1%
 
a5837212.5%
 
t419729.0%
 
v285506.1%
 
g267805.7%
 
y267805.7%
 
n258225.5%
 
e217294.6%
 
s160283.4%
 
o134582.9%
 
m126012.7%
 
b123582.6%
 
96032.1%
 
/81541.7%
 
h72801.6%
 
u71161.5%
 
l61791.3%
 
d48661.0%
 
w39040.8%
 
836700.8%
 
036700.8%
 
k24980.5%
 
p24420.5%
 
f17700.4%
 

extraction_type_class
Categorical

HIGH CORRELATION

Distinct count: 7
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 464.2 KiB

Value          Count   Frequency (%)
gravity        26780   45.1%
handpump       16456   27.7%
other           6430   10.8%
submersible     6179   10.4%
motorpump       2987   5.0%
rope pump        451   0.8%
wind-powered     117   0.2%
 

Length

Max length: 12
Median length: 7
Mean length: 7.602239057
Min length: 5

Overview of Unicode Properties

Unique unicode characters: 21
Unique unicode categories: 3
Unique unicode scripts: 2
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

ValueCountFrequency (%) 
a432369.6%
 
r429449.5%
 
p403568.9%
 
t361978.0%
 
i330767.3%
 
m290606.4%
 
g267805.9%
 
v267805.9%
 
y267805.9%
 
u260735.8%
 
h228865.1%
 
e194734.3%
 
d166903.7%
 
n165733.7%
 
o129722.9%
 
s123582.7%
 
b123582.7%
 
l61791.4%
 
4510.1%
 
w2340.1%
 
-117< 0.1%
 

Most occurring categories

ValueCountFrequency (%) 
Lowercase Letter45100599.9%
 
Space Separator4510.1%
 
Dash Punctuation117< 0.1%
 

Most frequent Lowercase Letter characters

ValueCountFrequency (%) 
a432369.6%
 
r429449.5%
 
p403568.9%
 
t361978.0%
 
i330767.3%
 
m290606.4%
 
g267805.9%
 
v267805.9%
 
y267805.9%
 
u260735.8%
 
h228865.1%
 
e194734.3%
 
d166903.7%
 
n165733.7%
 
o129722.9%
 
s123582.7%
 
b123582.7%
 
l61791.4%
 
w2340.1%
 

Most frequent Dash Punctuation characters

ValueCountFrequency (%) 
-117100.0%
 

Most frequent Space Separator characters

ValueCountFrequency (%) 
451100.0%
 

Most occurring scripts

ValueCountFrequency (%) 
Latin45100599.9%
 
Common5680.1%
 

Most frequent Latin characters

ValueCountFrequency (%) 
a432369.6%
 
r429449.5%
 
p403568.9%
 
t361978.0%
 
i330767.3%
 
m290606.4%
 
g267805.9%
 
v267805.9%
 
y267805.9%
 
u260735.8%
 
h228865.1%
 
e194734.3%
 
d166903.7%
 
n165733.7%
 
o129722.9%
 
s123582.7%
 
b123582.7%
 
l61791.4%
 
w2340.1%
 

Most frequent Common characters

ValueCountFrequency (%) 
45179.4%
 
-11720.6%
 

Most occurring blocks

ValueCountFrequency (%) 
ASCII451573100.0%
 

Most frequent ASCII characters

ValueCountFrequency (%) 
a432369.6%
 
r429449.5%
 
p403568.9%
 
t361978.0%
 
i330767.3%
 
m290606.4%
 
g267805.9%
 
v267805.9%
 
y267805.9%
 
u260735.8%
 
h228865.1%
 
e194734.3%
 
d166903.7%
 
n165733.7%
 
o129722.9%
 
s123582.7%
 
b123582.7%
 
l61791.4%
 
4510.1%
 
w2340.1%
 
-117< 0.1%
 

management
Categorical

HIGH CORRELATION

Distinct count: 12
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 464.2 KiB

Value              Count   Frequency (%)
vwc                40507   68.2%
wug                 6515   11.0%
water board         2933   4.9%
wua                 2535   4.3%
private operator    1971   3.3%
parastatal          1768   3.0%
water authority      904   1.5%
other                844   1.4%
company              685   1.2%
unknown              561   0.9%
other - school        99   0.2%
trust                 78   0.1%
 

Length

Max length: 16
Median length: 3
Mean length: 4.350639731
Min length: 3

Overview of Unicode Properties

Unique unicode characters: 23
Unique unicode categories: 3
Unique unicode scripts: 2
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

ValueCountFrequency (%) 
w5395520.9%
 
v4247816.4%
 
c4129116.0%
 
a219088.5%
 
r163766.3%
 
t142225.5%
 
u105934.1%
 
o101663.9%
 
e87223.4%
 
g65152.5%
 
p63952.5%
 
60062.3%
 
b29331.1%
 
d29331.1%
 
i28751.1%
 
n23680.9%
 
h19460.8%
 
s19450.8%
 
l18670.7%
 
y15890.6%
 
m6850.3%
 
k5610.2%
 
-99< 0.1%
 

Most occurring categories

ValueCountFrequency (%) 
Lowercase Letter25232397.6%
 
Space Separator60062.3%
 
Dash Punctuation99< 0.1%
 

Most frequent Lowercase Letter characters

ValueCountFrequency (%) 
w5395521.4%
 
v4247816.8%
 
c4129116.4%
 
a219088.7%
 
r163766.5%
 
t142225.6%
 
u105934.2%
 
o101664.0%
 
e87223.5%
 
g65152.6%
 
p63952.5%
 
b29331.2%
 
d29331.2%
 
i28751.1%
 
n23680.9%
 
h19460.8%
 
s19450.8%
 
l18670.7%
 
y15890.6%
 
m6850.3%
 
k5610.2%
 

Most frequent Space Separator characters

ValueCountFrequency (%) 
6006100.0%
 

Most frequent Dash Punctuation characters

ValueCountFrequency (%) 
-99100.0%
 

Most occurring scripts

ValueCountFrequency (%) 
Latin25232397.6%
 
Common61052.4%
 

Most frequent Latin characters

ValueCountFrequency (%) 
w5395521.4%
 
v4247816.8%
 
c4129116.4%
 
a219088.7%
 
r163766.5%
 
t142225.6%
 
u105934.2%
 
o101664.0%
 
e87223.5%
 
g65152.6%
 
p63952.5%
 
b29331.2%
 
d29331.2%
 
i28751.1%
 
n23680.9%
 
h19460.8%
 
s19450.8%
 
l18670.7%
 
y15890.6%
 
m6850.3%
 
k5610.2%
 

Most frequent Common characters

ValueCountFrequency (%) 
600698.4%
 
-991.6%
 

Most occurring blocks

ValueCountFrequency (%) 
ASCII258428100.0%
 

Most frequent ASCII characters

ValueCountFrequency (%) 
w5395520.9%
 
v4247816.4%
 
c4129116.0%
 
a219088.5%
 
r163766.3%
 
t142225.5%
 
u105934.1%
 
o101663.9%
 
e87223.4%
 
g65152.5%
 
p63952.5%
 
60062.3%
 
b29331.1%
 
d29331.1%
 
i28751.1%
 
n23680.9%
 
h19460.8%
 
s19450.8%
 
l18670.7%
 
y15890.6%
 
m6850.3%
 
k5610.2%
 
-99< 0.1%
 

management_group
Categorical

HIGH CORRELATION

Distinct count: 5
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 464.2 KiB

Value        Count   Frequency (%)
user-group   52490   88.4%
commercial    3638   6.1%
parastatal    1768   3.0%
other          943   1.6%
unknown        561   0.9%
 

Length

Max length: 10
Median length: 10
Mean length: 9.892289562
Min length: 5

Overview of Unicode Properties

Unique unicode characters: 18
Unique unicode categories: 2
Unique unicode scripts: 2
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

ValueCountFrequency (%) 
r11132918.9%
 
u10554118.0%
 
o576329.8%
 
e570719.7%
 
s542589.2%
 
p542589.2%
 
-524908.9%
 
g524908.9%
 
a107101.8%
 
c72761.2%
 
m72761.2%
 
l54060.9%
 
t44790.8%
 
i36380.6%
 
n16830.3%
 
h9430.2%
 
k5610.1%
 
w5610.1%
 

Most occurring categories

ValueCountFrequency (%) 
Lowercase Letter53511291.1%
 
Dash Punctuation524908.9%
 

Most frequent Lowercase Letter characters

ValueCountFrequency (%) 
r11132920.8%
 
u10554119.7%
 
o5763210.8%
 
e5707110.7%
 
s5425810.1%
 
p5425810.1%
 
g524909.8%
 
a107102.0%
 
c72761.4%
 
m72761.4%
 
l54061.0%
 
t44790.8%
 
i36380.7%
 
n16830.3%
 
h9430.2%
 
k5610.1%
 
w5610.1%
 

Most frequent Dash Punctuation characters

ValueCountFrequency (%) 
-52490100.0%
 

Most occurring scripts

ValueCountFrequency (%) 
Latin53511291.1%
 
Common524908.9%
 

Most frequent Latin characters

ValueCountFrequency (%) 
r11132920.8%
 
u10554119.7%
 
o5763210.8%
 
e5707110.7%
 
s5425810.1%
 
p5425810.1%
 
g524909.8%
 
a107102.0%
 
c72761.4%
 
m72761.4%
 
l54061.0%
 
t44790.8%
 
i36380.7%
 
n16830.3%
 
h9430.2%
 
k5610.1%
 
w5610.1%
 

Most frequent Common characters

ValueCountFrequency (%) 
-52490100.0%
 

Most occurring blocks

ValueCountFrequency (%) 
ASCII587602100.0%
 

Most frequent ASCII characters

ValueCountFrequency (%) 
r11132918.9%
 
u10554118.0%
 
o576329.8%
 
e570719.7%
 
s542589.2%
 
p542589.2%
 
-524908.9%
 
g524908.9%
 
a107101.8%
 
c72761.2%
 
m72761.2%
 
l54060.9%
 
t44790.8%
 
i36380.6%
 
n16830.3%
 
h9430.2%
 
k5610.1%
 
w5610.1%
 

payment
Categorical

HIGH CORRELATION

Distinct count: 7
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 464.2 KiB

Value                   Count   Frequency (%)
never pay               25348   42.7%
pay per bucket           8985   15.1%
pay monthly              8300   14.0%
unknown                  8157   13.7%
pay when scheme fails    3914   6.6%
pay annually             3642   6.1%
other                    1054   1.8%
 

Length

Max length: 21
Median length: 9
Mean length: 10.66479798
Min length: 5

Overview of Unicode Properties

Unique unicode characters: 21
Unique unicode categories: 2
Unique unicode scripts: 2
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

ValueCountFrequency (%) 
e8146212.9%
 
n6931710.9%
 
6700210.6%
 
y621319.8%
 
a613879.7%
 
p591749.3%
 
r353875.6%
 
v253484.0%
 
u207843.3%
 
l194983.1%
 
t183392.9%
 
o175112.8%
 
h171822.7%
 
k171422.7%
 
c128992.0%
 
m122141.9%
 
w120711.9%
 
b89851.4%
 
s78281.2%
 
f39140.6%
 
i39140.6%
 

Most occurring categories

ValueCountFrequency (%) 
Lowercase Letter56648789.4%
 
Space Separator6700210.6%
 

Most frequent Lowercase Letter characters

ValueCountFrequency (%) 
e8146214.4%
 
n6931712.2%
 
y6213111.0%
 
a6138710.8%
 
p5917410.4%
 
r353876.2%
 
v253484.5%
 
u207843.7%
 
l194983.4%
 
t183393.2%
 
o175113.1%
 
h171823.0%
 
k171423.0%
 
c128992.3%
 
m122142.2%
 
w120712.1%
 
b89851.6%
 
s78281.4%
 
f39140.7%
 
i39140.7%
 

Most frequent Space Separator characters

ValueCountFrequency (%) 
67002100.0%
 

Most occurring scripts

ValueCountFrequency (%) 
Latin56648789.4%
 
Common6700210.6%
 

Most frequent Latin characters

ValueCountFrequency (%) 
e8146214.4%
 
n6931712.2%
 
y6213111.0%
 
a6138710.8%
 
p5917410.4%
 
r353876.2%
 
v253484.5%
 
u207843.7%
 
l194983.4%
 
t183393.2%
 
o175113.1%
 
h171823.0%
 
k171423.0%
 
c128992.3%
 
m122142.2%
 
w120712.1%
 
b89851.6%
 
s78281.4%
 
f39140.7%
 
i39140.7%
 

Most frequent Common characters

ValueCountFrequency (%) 
67002100.0%
 

Most occurring blocks

ValueCountFrequency (%) 
ASCII633489100.0%
 

Most frequent ASCII characters

ValueCountFrequency (%) 
e8146212.9%
 
n6931710.9%
 
6700210.6%
 
y621319.8%
 
a613879.7%
 
p591749.3%
 
r353875.6%
 
v253484.0%
 
u207843.3%
 
l194983.1%
 
t183392.9%
 
o175112.8%
 
h171822.7%
 
k171422.7%
 
c128992.0%
 
m122141.9%
 
w120711.9%
 
b89851.4%
 
s78281.2%
 
f39140.6%
 
i39140.6%
 

payment_type
Categorical

HIGH CORRELATION

Distinct count: 7
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 464.2 KiB

Value        Count   Frequency (%)
never pay    25348   42.7%
per bucket    8985   15.1%
monthly       8300   14.0%
unknown       8157   13.7%
on failure    3914   6.6%
annually      3642   6.1%
other         1054   1.8%
 

Length

Max length: 10
Median length: 9
Mean length: 8.530757576
Min length: 5

Overview of Unicode Properties

Unique unicode characters: 20
Unique unicode categories: 2
Unique unicode scripts: 2
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

ValueCountFrequency (%) 
e7363414.5%
 
n6931713.7%
 
r393017.8%
 
382477.5%
 
y372907.4%
 
a365467.2%
 
p343336.8%
 
v253485.0%
 
u246984.9%
 
o214254.2%
 
l194983.8%
 
t183393.6%
 
k171423.4%
 
h93541.8%
 
b89851.8%
 
c89851.8%
 
m83001.6%
 
w81571.6%
 
f39140.8%
 
i39140.8%
 

Most occurring categories

ValueCountFrequency (%) 
Lowercase Letter46848092.5%
 
Space Separator382477.5%
 

Most frequent Lowercase Letter characters

ValueCountFrequency (%) 
e7363415.7%
 
n6931714.8%
 
r393018.4%
 
y372908.0%
 
a365467.8%
 
p343337.3%
 
v253485.4%
 
u246985.3%
 
o214254.6%
 
l194984.2%
 
t183393.9%
 
k171423.7%
 
h93542.0%
 
b89851.9%
 
c89851.9%
 
m83001.8%
 
w81571.7%
 
f39140.8%
 
i39140.8%
 

Most frequent Space Separator characters

ValueCountFrequency (%) 
38247100.0%
 

Most occurring scripts

ValueCountFrequency (%) 
Latin46848092.5%
 
Common382477.5%
 

Most frequent Latin characters

ValueCountFrequency (%) 
e7363415.7%
 
n6931714.8%
 
r393018.4%
 
y372908.0%
 
a365467.8%
 
p343337.3%
 
v253485.4%
 
u246985.3%
 
o214254.6%
 
l194984.2%
 
t183393.9%
 
k171423.7%
 
h93542.0%
 
b89851.9%
 
c89851.9%
 
m83001.8%
 
w81571.7%
 
f39140.8%
 
i39140.8%
 

Most frequent Common characters

ValueCountFrequency (%) 
38247100.0%
 

Most occurring blocks

ValueCountFrequency (%) 
ASCII506727100.0%
 

Most frequent ASCII characters

ValueCountFrequency (%) 
e7363414.5%
 
n6931713.7%
 
r393017.8%
 
382477.5%
 
y372907.4%
 
a365467.2%
 
p343336.8%
 
v253485.0%
 
u246984.9%
 
o214254.2%
 
l194983.8%
 
t183393.6%
 
k171423.4%
 
h93541.8%
 
b89851.8%
 
c89851.8%
 
m83001.6%
 
w81571.6%
 
f39140.8%
 
i39140.8%
 

water_quality
Categorical

HIGH CORRELATION

Distinct count: 8
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 464.2 KiB

Value                Count   Frequency (%)
soft                 50818   85.6%
salty                 4856   8.2%
unknown               1876   3.2%
milky                  804   1.4%
coloured               490   0.8%
salty abandoned        339   0.6%
fluoride               200   0.3%
fluoride abandoned      17   < 0.1%
 

Length

Max length: 18
Median length: 4
Mean length: 4.303282828
Min length: 4

Overview of Unicode Properties

Unique unicode characters: 19
Unique unicode categories: 2
Unique unicode scripts: 2
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

ValueCountFrequency (%) 
s5601321.9%
 
t5601321.9%
 
o5424721.2%
 
f5103520.0%
 
l67062.6%
 
n63402.5%
 
y59992.3%
 
a59072.3%
 
k26801.0%
 
u25831.0%
 
w18760.7%
 
d14190.6%
 
e10630.4%
 
i10210.4%
 
m8040.3%
 
r7070.3%
 
c4900.2%
 
3560.1%
 
b3560.1%
 

Most occurring categories

ValueCountFrequency (%) 
Lowercase Letter25525999.9%
 
Space Separator3560.1%
 

Most frequent Lowercase Letter characters

ValueCountFrequency (%) 
s5601321.9%
 
t5601321.9%
 
o5424721.3%
 
f5103520.0%
 
l67062.6%
 
n63402.5%
 
y59992.4%
 
a59072.3%
 
k26801.0%
 
u25831.0%
 
w18760.7%
 
d14190.6%
 
e10630.4%
 
i10210.4%
 
m8040.3%
 
r7070.3%
 
c4900.2%
 
b3560.1%
 

Most frequent Space Separator characters

ValueCountFrequency (%) 
356100.0%
 

Most occurring scripts

ValueCountFrequency (%) 
Latin25525999.9%
 
Common3560.1%
 

Most frequent Latin characters

ValueCountFrequency (%) 
s5601321.9%
 
t5601321.9%
 
o5424721.3%
 
f5103520.0%
 
l67062.6%
 
n63402.5%
 
y59992.4%
 
a59072.3%
 
k26801.0%
 
u25831.0%
 
w18760.7%
 
d14190.6%
 
e10630.4%
 
i10210.4%
 
m8040.3%
 
r7070.3%
 
c4900.2%
 
b3560.1%
 

Most frequent Common characters

ValueCountFrequency (%) 
356100.0%
 

Most occurring blocks

ValueCountFrequency (%) 
ASCII255615100.0%
 

Most frequent ASCII characters

ValueCountFrequency (%) 
s5601321.9%
 
t5601321.9%
 
o5424721.2%
 
f5103520.0%
 
l67062.6%
 
n63402.5%
 
y59992.3%
 
a59072.3%
 
k26801.0%
 
u25831.0%
 
w18760.7%
 
d14190.6%
 
e10630.4%
 
i10210.4%
 
m8040.3%
 
r7070.3%
 
c4900.2%
 
3560.1%
 
b3560.1%
 

quality_group
Categorical

HIGH CORRELATION

Distinct count: 6
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 464.2 KiB

Value      Count   Frequency (%)
good       50818   85.6%
salty       5195   8.7%
unknown     1876   3.2%
milky        804   1.4%
colored      490   0.8%
fluoride     217   0.4%
 

Length

Max length: 8
Median length: 4
Mean length: 4.23510101
Min length: 4

Overview of Unicode Properties

Unique unicode characters: 18
Unique unicode categories: 1
Unique unicode scripts: 1
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

ValueCountFrequency (%) 
o10470941.6%
 
d5152520.5%
 
g5081820.2%
 
l67062.7%
 
y59992.4%
 
n56282.2%
 
s51952.1%
 
a51952.1%
 
t51952.1%
 
k26801.1%
 
u20930.8%
 
w18760.7%
 
i10210.4%
 
m8040.3%
 
r7070.3%
 
e7070.3%
 
c4900.2%
 
f2170.1%
 

Most occurring categories

ValueCountFrequency (%) 
Lowercase Letter251565100.0%
 

Most frequent Lowercase Letter characters

ValueCountFrequency (%) 
o10470941.6%
 
d5152520.5%
 
g5081820.2%
 
l67062.7%
 
y59992.4%
 
n56282.2%
 
s51952.1%
 
a51952.1%
 
t51952.1%
 
k26801.1%
 
u20930.8%
 
w18760.7%
 
i10210.4%
 
m8040.3%
 
r7070.3%
 
e7070.3%
 
c4900.2%
 
f2170.1%
 

Most occurring scripts

ValueCountFrequency (%) 
Latin251565100.0%
 

Most frequent Latin characters

ValueCountFrequency (%) 
o10470941.6%
 
d5152520.5%
 
g5081820.2%
 
l67062.7%
 
y59992.4%
 
n56282.2%
 
s51952.1%
 
a51952.1%
 
t51952.1%
 
k26801.1%
 
u20930.8%
 
w18760.7%
 
i10210.4%
 
m8040.3%
 
r7070.3%
 
e7070.3%
 
c4900.2%
 
f2170.1%
 

Most occurring blocks

ValueCountFrequency (%) 
ASCII251565100.0%
 

Most frequent ASCII characters

ValueCountFrequency (%) 
o10470941.6%
 
d5152520.5%
 
g5081820.2%
 
l67062.7%
 
y59992.4%
 
n56282.2%
 
s51952.1%
 
a51952.1%
 
t51952.1%
 
k26801.1%
 
u20930.8%
 
w18760.7%
 
i10210.4%
 
m8040.3%
 
r7070.3%
 
e7070.3%
 
c4900.2%
 
f2170.1%
 

quantity
Categorical

HIGH CORRELATION

Distinct count: 5
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 464.2 KiB

Value          Count   Frequency (%)
enough         33186   55.9%
insufficient   15129   25.5%
dry             6246   10.5%
seasonal        4050   6.8%
unknown          789   1.3%
 

Length

Max length: 12
Median length: 6
Mean length: 7.362373737
Min length: 3

Overview of Unicode Properties

Unique unicode characters: 18
Unique unicode categories: 1
Unique unicode scripts: 1
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

ValueCountFrequency (%) 
n6986116.0%
 
e5236512.0%
 
u4910411.2%
 
i4538710.4%
 
o380258.7%
 
g331867.6%
 
h331867.6%
 
f302586.9%
 
s232295.3%
 
c151293.5%
 
t151293.5%
 
a81001.9%
 
d62461.4%
 
r62461.4%
 
y62461.4%
 
l40500.9%
 
k7890.2%
 
w7890.2%
 

Most occurring categories

ValueCountFrequency (%) 
Lowercase Letter437325100.0%
 

Most frequent Lowercase Letter characters

ValueCountFrequency (%) 
n6986116.0%
 
e5236512.0%
 
u4910411.2%
 
i4538710.4%
 
o380258.7%
 
g331867.6%
 
h331867.6%
 
f302586.9%
 
s232295.3%
 
c151293.5%
 
t151293.5%
 
a81001.9%
 
d62461.4%
 
r62461.4%
 
y62461.4%
 
l40500.9%
 
k7890.2%
 
w7890.2%
 

Most occurring scripts

ValueCountFrequency (%) 
Latin437325100.0%
 

Most frequent Latin characters

ValueCountFrequency (%) 
n6986116.0%
 
e5236512.0%
 
u4910411.2%
 
i4538710.4%
 
o380258.7%
 
g331867.6%
 
h331867.6%
 
f302586.9%
 
s232295.3%
 
c151293.5%
 
t151293.5%
 
a81001.9%
 
d62461.4%
 
r62461.4%
 
y62461.4%
 
l40500.9%
 
k7890.2%
 
w7890.2%
 

Most occurring blocks

ValueCountFrequency (%) 
ASCII437325100.0%
 

Most frequent ASCII characters

ValueCountFrequency (%) 
n6986116.0%
 
e5236512.0%
 
u4910411.2%
 
i4538710.4%
 
o380258.7%
 
g331867.6%
 
h331867.6%
 
f302586.9%
 
s232295.3%
 
c151293.5%
 
t151293.5%
 
a81001.9%
 
d62461.4%
 
r62461.4%
 
y62461.4%
 
l40500.9%
 
k7890.2%
 
w7890.2%
 

quantity_group
Categorical

HIGH CORRELATION

Distinct count: 5
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 464.2 KiB

Value          Count   Frequency (%)
enough         33186   55.9%
insufficient   15129   25.5%
dry             6246   10.5%
seasonal        4050   6.8%
unknown          789   1.3%
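
The counts for quantity_group match those reported for quantity value for value, which suggests, but does not prove, that the two columns are identical row by row. A minimal sketch of a row-wise check follows; the helper name `differing_rows` and the toy lists are hypothetical and are not rows taken from the dataset.

```python
def differing_rows(left, right):
    """Return the row indices at which two equally long columns disagree."""
    left, right = list(left), list(right)
    if len(left) != len(right):
        raise ValueError("columns must have the same number of rows")
    return [i for i, (a, b) in enumerate(zip(left, right)) if a != b]


# Hypothetical toy columns standing in for quantity and quantity_group.
quantity = ["enough", "dry", "insufficient", "enough", "seasonal"]
quantity_group = ["enough", "dry", "insufficient", "enough", "seasonal"]

mismatches = differing_rows(quantity, quantity_group)
print("identical" if not mismatches else f"differs at rows {mismatches}")
```

If a check like this returned no mismatches on the real columns, keeping only one of the two would lose no information.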
 

Length

Max length: 12
Median length: 6
Mean length: 7.362373737
Min length: 3

Overview of Unicode Properties

Unique unicode characters: 18
Unique unicode categories: 1
Unique unicode scripts: 1
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

ValueCountFrequency (%) 
n6986116.0%
 
e5236512.0%
 
u4910411.2%
 
i4538710.4%
 
o380258.7%
 
g331867.6%
 
h331867.6%
 
f302586.9%
 
s232295.3%
 
c151293.5%
 
t151293.5%
 
a81001.9%
 
d62461.4%
 
r62461.4%
 
y62461.4%
 
l40500.9%
 
k7890.2%
 
w7890.2%
 

Most occurring categories

ValueCountFrequency (%) 
Lowercase Letter437325100.0%
 

Most frequent Lowercase Letter characters

ValueCountFrequency (%) 
n6986116.0%
 
e5236512.0%
 
u4910411.2%
 
i4538710.4%
 
o380258.7%
 
g331867.6%
 
h331867.6%
 
f302586.9%
 
s232295.3%
 
c151293.5%
 
t151293.5%
 
a81001.9%
 
d62461.4%
 
r62461.4%
 
y62461.4%
 
l40500.9%
 
k7890.2%
 
w7890.2%
 

Most occurring scripts

ValueCountFrequency (%) 
Latin437325100.0%
 

Most frequent Latin characters

ValueCountFrequency (%) 
n6986116.0%
 
e5236512.0%
 
u4910411.2%
 
i4538710.4%
 
o380258.7%
 
g331867.6%
 
h331867.6%
 
f302586.9%
 
s232295.3%
 
c151293.5%
 
t151293.5%
 
a81001.9%
 
d62461.4%
 
r62461.4%
 
y62461.4%
 
l40500.9%
 
k7890.2%
 
w7890.2%
 

Most occurring blocks

ValueCountFrequency (%) 
ASCII437325100.0%
 

Most frequent ASCII characters

ValueCountFrequency (%) 
n6986116.0%
 
e5236512.0%
 
u4910411.2%
 
i4538710.4%
 
o380258.7%
 
g331867.6%
 
h331867.6%
 
f302586.9%
 
s232295.3%
 
c151293.5%
 
t151293.5%
 
a81001.9%
 
d62461.4%
 
r62461.4%
 
y62461.4%
 
l40500.9%
 
k7890.2%
 
w7890.2%
 

source
Categorical

HIGH CORRELATION

Distinct count: 10
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 464.2 KiB

Value                  Count   Frequency (%)
spring                 17021   28.7%
shallow well           16824   28.3%
machine dbh            11075   18.6%
river                   9612   16.2%
rainwater harvesting    2295   3.9%
hand dtw                 874   1.5%
lake                     765   1.3%
dam                      656   1.1%
other                    212   0.4%
unknown                   66   0.1%
 

Length

Max length: 20
Median length: 11
Mean length: 8.978804714
Min length: 3

Overview of Unicode Properties

Unique unicode characters: 21
Unique unicode categories: 2
Unique unicode scripts: 2
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

ValueCountFrequency (%) 
l6806112.8%
 
r433428.1%
 
e430788.1%
 
h423557.9%
 
i422987.9%
 
a370797.0%
 
w368836.9%
 
s361406.8%
 
n337586.3%
 
310685.8%
 
g193163.6%
 
o171023.2%
 
p170213.2%
 
d134792.5%
 
v119072.2%
 
m117312.2%
 
c110752.1%
 
b110752.1%
 
t56761.1%
 
k8310.2%
 
u66< 0.1%
 

Most occurring categories

ValueCountFrequency (%) 
Lowercase Letter50227394.2%
 
Space Separator310685.8%
 

Most frequent Lowercase Letter characters

ValueCountFrequency (%) 
l6806113.6%
 
r433428.6%
 
e430788.6%
 
h423558.4%
 
i422988.4%
 
a370797.4%
 
w368837.3%
 
s361407.2%
 
n337586.7%
 
g193163.8%
 
o171023.4%
 
p170213.4%
 
d134792.7%
 
v119072.4%
 
m117312.3%
 
c110752.2%
 
b110752.2%
 
t56761.1%
 
k8310.2%
 
u66< 0.1%
 

Most frequent Space Separator characters

ValueCountFrequency (%) 
31068100.0%
 

Most occurring scripts

ValueCountFrequency (%) 
Latin50227394.2%
 
Common310685.8%
 

Most frequent Latin characters

ValueCountFrequency (%) 
l6806113.6%
 
r433428.6%
 
e430788.6%
 
h423558.4%
 
i422988.4%
 
a370797.4%
 
w368837.3%
 
s361407.2%
 
n337586.7%
 
g193163.8%
 
o171023.4%
 
p170213.4%
 
d134792.7%
 
v119072.4%
 
m117312.3%
 
c110752.2%
 
b110752.2%
 
t56761.1%
 
k8310.2%
 
u66< 0.1%
 

Most frequent Common characters

ValueCountFrequency (%) 
31068100.0%
 

Most occurring blocks

ValueCountFrequency (%) 
ASCII533341100.0%
 

Most frequent ASCII characters

ValueCountFrequency (%) 
l6806112.8%
 
r433428.1%
 
e430788.1%
 
h423557.9%
 
i422987.9%
 
a370797.0%
 
w368836.9%
 
s361406.8%
 
n337586.3%
 
310685.8%
 
g193163.6%
 
o171023.2%
 
p170213.2%
 
d134792.5%
 
v119072.2%
 
m117312.2%
 
c110752.1%
 
b110752.1%
 
t56761.1%
 
k8310.2%
 
u66< 0.1%
 

source_type
Categorical

HIGH CORRELATION

Distinct count: 7
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 464.2 KiB

Value                  Count   Frequency (%)
spring                 17021   28.7%
shallow well           16824   28.3%
borehole               11949   20.1%
river/lake             10377   17.5%
rainwater harvesting    2295   3.9%
dam                      656   1.1%
other                    278   0.5%
 

Length

Max length: 20
Median length: 8
Mean length: 9.303602694
Min length: 3

Overview of Unicode Properties

Unique unicode characters: 20
Unique unicode categories: 3
Unique unicode scripts: 2
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

ValueCountFrequency (%) 
l8962216.2%
 
e6634412.0%
 
r5688710.3%
 
o410007.4%
 
s361406.5%
 
w359436.5%
 
a347426.3%
 
i319885.8%
 
h313465.7%
 
n216113.9%
 
g193163.5%
 
191193.5%
 
p170213.1%
 
v126722.3%
 
b119492.2%
 
/103771.9%
 
k103771.9%
 
t48680.9%
 
d6560.1%
 
m6560.1%
 

Most occurring categories

ValueCountFrequency (%) 
Lowercase Letter52313894.7%
 
Space Separator191193.5%
 
Other Punctuation103771.9%
 

Most frequent Lowercase Letter characters

ValueCountFrequency (%) 
l8962217.1%
 
e6634412.7%
 
r5688710.9%
 
o410007.8%
 
s361406.9%
 
w359436.9%
 
a347426.6%
 
i319886.1%
 
h313466.0%
 
n216114.1%
 
g193163.7%
 
p170213.3%
 
v126722.4%
 
b119492.3%
 
k103772.0%
 
t48680.9%
 
d6560.1%
 
m6560.1%
 

Most frequent Space Separator characters

ValueCountFrequency (%) 
19119100.0%
 

Most frequent Other Punctuation characters

ValueCountFrequency (%) 
/10377100.0%
 

Most occurring scripts

ValueCountFrequency (%) 
Latin52313894.7%
 
Common294965.3%
 

Most frequent Latin characters

ValueCountFrequency (%) 
l8962217.1%
 
e6634412.7%
 
r5688710.9%
 
o410007.8%
 
s361406.9%
 
w359436.9%
 
a347426.6%
 
i319886.1%
 
h313466.0%
 
n216114.1%
 
g193163.7%
 
p170213.3%
 
v126722.4%
 
b119492.3%
 
k103772.0%
 
t48680.9%
 
d6560.1%
 
m6560.1%
 

Most frequent Common characters

ValueCountFrequency (%) 
1911964.8%
 
/1037735.2%
 

Most occurring blocks

ValueCountFrequency (%) 
ASCII552634100.0%
 

Most frequent ASCII characters

ValueCountFrequency (%) 
l8962216.2%
 
e6634412.0%
 
r5688710.3%
 
o410007.4%
 
s361406.5%
 
w359436.5%
 
a347426.3%
 
i319885.8%
 
h313465.7%
 
n216113.9%
 
g193163.5%
 
191193.5%
 
p170213.1%
 
v126722.3%
 
b119492.2%
 
/103771.9%
 
k103771.9%
 
t48680.9%
 
d6560.1%
 
m6560.1%
 

source_class
Categorical

HIGH CORRELATION

Distinct count: 3
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 464.2 KiB

Value         Count   Frequency (%)
groundwater   45794   77.1%
surface       13328   22.4%
unknown         278   0.5%
 

Length

Max length: 11
Median length: 11
Mean length: 10.08377104
Min length: 7

Overview of Unicode Properties

Unique unicode characters: 14
Unique unicode categories: 1
Unique unicode scripts: 1
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

ValueCountFrequency (%) 
r10491617.5%
 
u594009.9%
 
a591229.9%
 
e591229.9%
 
n466287.8%
 
o460727.7%
 
w460727.7%
 
g457947.6%
 
d457947.6%
 
t457947.6%
 
s133282.2%
 
f133282.2%
 
c133282.2%
 
k278< 0.1%
 

Most occurring categories

ValueCountFrequency (%) 
Lowercase Letter598976100.0%
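
The 598976 character total can be cross-checked against the value counts and the mean length reported earlier for source_class. The short sketch below recomputes both from the three counts in the frequency table; the variable names are illustrative only.

```python
# Counts copied from the source_class frequency table in this report.
value_counts = {"groundwater": 45794, "surface": 13328, "unknown": 278}

# Total rows, total characters, and the count-weighted mean string length.
n_rows = sum(value_counts.values())
total_chars = sum(len(value) * count for value, count in value_counts.items())
mean_length = total_chars / n_rows

print(n_rows, total_chars, round(mean_length, 8))
```

The recomputed figures agree with the report: 59400 rows, 598976 characters, and a mean length of 10.08377104.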
 

Most frequent Lowercase Letter characters

ValueCountFrequency (%) 
r10491617.5%
 
u594009.9%
 
a591229.9%
 
e591229.9%
 
n466287.8%
 
o460727.7%
 
w460727.7%
 
g457947.6%
 
d457947.6%
 
t457947.6%
 
s133282.2%
 
f133282.2%
 
c133282.2%
 
k278< 0.1%
 

Most occurring scripts

ValueCountFrequency (%) 
Latin598976100.0%
 

Most frequent Latin characters

ValueCountFrequency (%) 
r10491617.5%
 
u594009.9%
 
a591229.9%
 
e591229.9%
 
n466287.8%
 
o460727.7%
 
w460727.7%
 
g457947.6%
 
d457947.6%
 
t457947.6%
 
s133282.2%
 
f133282.2%
 
c133282.2%
 
k278< 0.1%
 

Most occurring blocks

ValueCountFrequency (%) 
ASCII598976100.0%
 

Most frequent ASCII characters

ValueCountFrequency (%) 
r10491617.5%
 
u594009.9%
 
a591229.9%
 
e591229.9%
 
n466287.8%
 
o460727.7%
 
w460727.7%
 
g457947.6%
 
d457947.6%
 
t457947.6%
 
s133282.2%
 
f133282.2%
 
c133282.2%
 
k278< 0.1%
 

waterpoint_type
Categorical

HIGH CORRELATION

Distinct count: 7
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 464.2 KiB

Value                         Count   Frequency (%)
communal standpipe            28522   48.0%
hand pump                     17488   29.4%
other                          6380   10.7%
communal standpipe multiple    6103   10.3%
improved spring                 784   1.3%
cattle trough                   116   0.2%
dam                               7   < 0.1%
 

Length

Max length: 27
Median length: 18
Mean length: 14.82757576
Min length: 3

Overview of Unicode Properties

Unique unicode characters: 18
Unique unicode categories: 2
Unique unicode scripts: 2
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

ValueCountFrequency (%) 
p11189712.7%
 
m9363210.6%
 
n875229.9%
 
a868619.9%
 
591166.7%
 
u583326.6%
 
d529046.0%
 
e480085.5%
 
t474565.4%
 
l469475.3%
 
i422964.8%
 
o419054.8%
 
s354094.0%
 
c347413.9%
 
h239842.7%
 
r80640.9%
 
g9000.1%
 
v7840.1%
 

Most occurring categories

ValueCountFrequency (%) 
Lowercase Letter82164293.3%
 
Space Separator591166.7%
 

Most frequent Lowercase Letter characters

ValueCountFrequency (%) 
p11189713.6%
 
m9363211.4%
 
n8752210.7%
 
a8686110.6%
 
u583327.1%
 
d529046.4%
 
e480085.8%
 
t474565.8%
 
l469475.7%
 
i422965.1%
 
o419055.1%
 
s354094.3%
 
c347414.2%
 
h239842.9%
 
r80641.0%
 
g9000.1%
 
v7840.1%
 

Most frequent Space Separator characters

ValueCountFrequency (%) 
59116100.0%
 

Most occurring scripts

ValueCountFrequency (%) 
Latin82164293.3%
 
Common591166.7%
 

Most frequent Latin characters

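Value | Count | Frequency (%)
p | 111897 | 13.6%
m | 93632 | 11.4%
n | 87522 | 10.7%
a | 86861 | 10.6%
u | 58332 | 7.1%
d | 52904 | 6.4%
e | 48008 | 5.8%
t | 47456 | 5.8%
l | 46947 | 5.7%
i | 42296 | 5.1%
o | 41905 | 5.1%
s | 35409 | 4.3%
c | 34741 | 4.2%
h | 23984 | 2.9%
r | 8064 | 1.0%
g | 900 | 0.1%
v | 784 | 0.1%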

Most frequent Common characters

Value | Count | Frequency (%)
(space) | 59116 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 880758 | 100.0%

Most frequent ASCII characters

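Value | Count | Frequency (%)
p | 111897 | 12.7%
m | 93632 | 10.6%
n | 87522 | 9.9%
a | 86861 | 9.9%
(space) | 59116 | 6.7%
u | 58332 | 6.6%
d | 52904 | 6.0%
e | 48008 | 5.5%
t | 47456 | 5.4%
l | 46947 | 5.3%
i | 42296 | 4.8%
o | 41905 | 4.8%
s | 35409 | 4.0%
c | 34741 | 3.9%
h | 23984 | 2.7%
r | 8064 | 0.9%
g | 900 | 0.1%
v | 784 | 0.1%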

waterpoint_type_group
Categorical

HIGH CORRELATION

Distinct count: 6
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 464.2 KiB
Value | Count | Frequency (%)
communal standpipe | 34625 | 58.3%
hand pump | 17488 | 29.4%
other | 6380 | 10.7%
improved spring | 784 | 1.3%
cattle trough | 116 | 0.2%
dam | 7 | < 0.1%

Length

Max length: 18
Median length: 18
Mean length: 13.90287879
Min length: 3

Overview of Unicode Properties

Unique unicode characters: 18
Unique unicode categories: 2
Unique unicode scripts: 2
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

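Value | Count | Frequency (%)
p | 105794 | 12.8%
m | 87529 | 10.6%
n | 87522 | 10.6%
a | 86861 | 10.5%
(space) | 53013 | 6.4%
d | 52904 | 6.4%
u | 52229 | 6.3%
o | 41905 | 5.1%
e | 41905 | 5.1%
t | 41353 | 5.0%
i | 36193 | 4.4%
s | 35409 | 4.3%
c | 34741 | 4.2%
l | 34741 | 4.2%
h | 23984 | 2.9%
r | 8064 | 1.0%
g | 900 | 0.1%
v | 784 | 0.1%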

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 772818 | 93.6%
Space Separator | 53013 | 6.4%

Most frequent Lowercase Letter characters

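Value | Count | Frequency (%)
p | 105794 | 13.7%
m | 87529 | 11.3%
n | 87522 | 11.3%
a | 86861 | 11.2%
d | 52904 | 6.8%
u | 52229 | 6.8%
o | 41905 | 5.4%
e | 41905 | 5.4%
t | 41353 | 5.4%
i | 36193 | 4.7%
s | 35409 | 4.6%
c | 34741 | 4.5%
l | 34741 | 4.5%
h | 23984 | 3.1%
r | 8064 | 1.0%
g | 900 | 0.1%
v | 784 | 0.1%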

Most frequent Space Separator characters

Value | Count | Frequency (%)
(space) | 53013 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 772818 | 93.6%
Common | 53013 | 6.4%

Most frequent Latin characters

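Value | Count | Frequency (%)
p | 105794 | 13.7%
m | 87529 | 11.3%
n | 87522 | 11.3%
a | 86861 | 11.2%
d | 52904 | 6.8%
u | 52229 | 6.8%
o | 41905 | 5.4%
e | 41905 | 5.4%
t | 41353 | 5.4%
i | 36193 | 4.7%
s | 35409 | 4.6%
c | 34741 | 4.5%
l | 34741 | 4.5%
h | 23984 | 3.1%
r | 8064 | 1.0%
g | 900 | 0.1%
v | 784 | 0.1%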

Most frequent Common characters

Value | Count | Frequency (%)
(space) | 53013 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 825831 | 100.0%

Most frequent ASCII characters

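Value | Count | Frequency (%)
p | 105794 | 12.8%
m | 87529 | 10.6%
n | 87522 | 10.6%
a | 86861 | 10.5%
(space) | 53013 | 6.4%
d | 52904 | 6.4%
u | 52229 | 6.3%
o | 41905 | 5.1%
e | 41905 | 5.1%
t | 41353 | 5.0%
i | 36193 | 4.4%
s | 35409 | 4.3%
c | 34741 | 4.2%
l | 34741 | 4.2%
h | 23984 | 2.9%
r | 8064 | 1.0%
g | 900 | 0.1%
v | 784 | 0.1%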

status_group
Categorical

Distinct count: 3
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 464.2 KiB
Value | Count | Frequency (%)
functional | 32259 | 54.3%
non functional | 22824 | 38.4%
functional needs repair | 4317 | 7.3%

Length

Max length: 23
Median length: 10
Mean length: 12.48176768
Min length: 10

Overview of Unicode Properties

Unique unicode characters: 15
Unique unicode categories: 2
Unique unicode scripts: 2
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

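Value | Count | Frequency (%)
n | 168765 | 22.8%
o | 82224 | 11.1%
i | 63717 | 8.6%
a | 63717 | 8.6%
f | 59400 | 8.0%
u | 59400 | 8.0%
c | 59400 | 8.0%
t | 59400 | 8.0%
l | 59400 | 8.0%
(space) | 31458 | 4.2%
e | 12951 | 1.7%
r | 8634 | 1.2%
d | 4317 | 0.6%
s | 4317 | 0.6%
p | 4317 | 0.6%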

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 709959 | 95.8%
Space Separator | 31458 | 4.2%

Most frequent Lowercase Letter characters

Value | Count | Frequency (%)
n | 168765 | 23.8%
o | 82224 | 11.6%
i | 63717 | 9.0%
a | 63717 | 9.0%
f | 59400 | 8.4%
u | 59400 | 8.4%
c | 59400 | 8.4%
t | 59400 | 8.4%
l | 59400 | 8.4%
e | 12951 | 1.8%
r | 8634 | 1.2%
d | 4317 | 0.6%
s | 4317 | 0.6%
p | 4317 | 0.6%

Most frequent Space Separator characters

Value | Count | Frequency (%)
(space) | 31458 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 709959 | 95.8%
Common | 31458 | 4.2%

Most frequent Latin characters

Value | Count | Frequency (%)
n | 168765 | 23.8%
o | 82224 | 11.6%
i | 63717 | 9.0%
a | 63717 | 9.0%
f | 59400 | 8.4%
u | 59400 | 8.4%
c | 59400 | 8.4%
t | 59400 | 8.4%
l | 59400 | 8.4%
e | 12951 | 1.8%
r | 8634 | 1.2%
d | 4317 | 0.6%
s | 4317 | 0.6%
p | 4317 | 0.6%

Most frequent Common characters

Value | Count | Frequency (%)
(space) | 31458 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 741417 | 100.0%

Most frequent ASCII characters

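Value | Count | Frequency (%)
n | 168765 | 22.8%
o | 82224 | 11.1%
i | 63717 | 8.6%
a | 63717 | 8.6%
f | 59400 | 8.0%
u | 59400 | 8.0%
c | 59400 | 8.0%
t | 59400 | 8.0%
l | 59400 | 8.0%
(space) | 31458 | 4.2%
e | 12951 | 1.7%
r | 8634 | 1.2%
d | 4317 | 0.6%
s | 4317 | 0.6%
p | 4317 | 0.6%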

Interactions

Correlations

Pearson's r

Pearson's correlation coefficient (r) measures the linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and (positive) scale of the two variables, so for an exactly linear relationship the steepness of the line does not affect r; only the sign of the slope determines the sign of r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
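The calculation described above can be sketched in a few lines of plain Python. This is an illustrative sketch rather than the code the report runs; the function name `pearson_r` and the sample data are assumptions made for the example.

```python
import math


def pearson_r(x, y):
    """Pearson's r: covariance of x and y over the product of their standard deviations."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # The 1/n factors of the covariance and of the two standard deviations
    # cancel, so unnormalised sums are enough.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    dev_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    dev_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if dev_x == 0 or dev_y == 0:
        raise ValueError("r is undefined when a variable is constant")
    return cov / (dev_x * dev_y)


# Hypothetical sample data, not taken from the dataset.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
r_original = pearson_r(x, y)
# Shifting and positively rescaling either variable leaves r unchanged.
r_transformed = pearson_r([3 * a + 7 for a in x], [0.5 * b - 2 for b in y])
print(r_original, r_transformed)
```

The second call illustrates the invariance under changes in location and scale: both calls give the same value up to floating-point rounding.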

Spearman's ρ

Spearman's rank correlation coefficient (ρ) measures the monotonic correlation between two variables, and is therefore better at capturing nonlinear monotonic relationships than Pearson's r. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
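As a minimal sketch of that recipe, the code below replaces each variable by its ranks and then applies the covariance-over-standard-deviations formula to the ranks. Tied values receive the average of their ranks, which is a common convention and an assumption here; the names `average_ranks` and `spearman_rho` are hypothetical.

```python
import math


def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: Pearson's r computed on the rank variables."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    dev_x = math.sqrt(sum((a - mean_x) ** 2 for a in rx))
    dev_y = math.sqrt(sum((b - mean_y) ** 2 for b in ry))
    if dev_x == 0 or dev_y == 0:
        raise ValueError("rho is undefined when a variable is constant")
    return cov / (dev_x * dev_y)


# A monotonic but nonlinear relationship (y = x cubed), chosen for illustration.
x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]
rho = spearman_rho(x, y)
print(rho)
```

Because y increases whenever x does, the ranks of the two variables coincide and ρ reaches its maximum even though the relationship is not linear.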

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
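The pair-counting definition can be sketched directly. The version below implements the formula exactly as stated (often called tau-a), where tied pairs count as neither concordant nor discordant; statistical libraries commonly apply a tie correction instead, so their results may differ when ties are present. The function name `kendall_tau_a` is hypothetical.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """(concordant - discordant) / total number of pairs, with no tie correction."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    concordant = 0
    discordant = 0
    for i, j in combinations(range(n), 2):
        # A pair is concordant when both variables order it the same way.
        direction = (x[i] - x[j]) * (y[i] - y[j])
        if direction > 0:
            concordant += 1
        elif direction < 0:
            discordant += 1
        # direction == 0 means a tie in x or y; it counts toward neither.
    total_pairs = n * (n - 1) // 2
    return (concordant - discordant) / total_pairs


# Four observations with one swapped pair: 5 concordant, 1 discordant, 6 pairs.
tau = kendall_tau_a([1, 2, 3, 4], [1, 3, 2, 4])
print(tau)
```

The double loop is quadratic in the number of observations, which is fine for illustration but slow for a column with tens of thousands of rows.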

Phik (φk)

Phik (φk) is a correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available from the Phik project.

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. The report uses the bias-corrected measure proposed by Bergsma in 2013.
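A minimal sketch of a bias-corrected Cramér's V, following the correction published by Bergsma (2013), is given below. It is not the report's own code. The function name `cramers_v_corrected` is hypothetical, and the example contingency table is an assumption: it is inferred from the value counts of waterpoint_type and waterpoint_type_group on the premise that each type maps to exactly one group, with "communal standpipe multiple" folded into "communal standpipe".

```python
import math


def cramers_v_corrected(table):
    """Bias-corrected Cramér's V for a contingency table given as a list of rows."""
    r, k = len(table), len(table[0])
    n = sum(sum(row) for row in table)
    if r < 2 or k < 2 or n < 2:
        raise ValueError("need at least a 2x2 table with 2 or more observations")
    row_totals = [sum(row) for row in table]
    col_totals = [sum(row[j] for row in table) for j in range(k)]
    # Pearson chi-squared statistic against the independence expectation.
    chi2 = 0.0
    for i in range(r):
        for j in range(k):
            expected = row_totals[i] * col_totals[j] / n
            if expected > 0:
                chi2 += (table[i][j] - expected) ** 2 / expected
    phi2 = chi2 / n
    # Bias corrections for phi squared and for the table dimensions.
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    denom = min(r_corr - 1, k_corr - 1)
    if denom <= 0:
        return 0.0
    return math.sqrt(phi2_corr / denom)


# Rows: waterpoint_type; columns: waterpoint_type_group (assumed mapping).
# Column order: communal standpipe, hand pump, other, improved spring,
# cattle trough, dam.
table = [
    [28522, 0, 0, 0, 0, 0],   # communal standpipe
    [0, 17488, 0, 0, 0, 0],   # hand pump
    [0, 0, 6380, 0, 0, 0],    # other
    [6103, 0, 0, 0, 0, 0],    # communal standpipe multiple
    [0, 0, 0, 784, 0, 0],     # improved spring
    [0, 0, 0, 0, 116, 0],     # cattle trough
    [0, 0, 0, 0, 0, 7],       # dam
]
v = cramers_v_corrected(table)
print(v)
```

Under the assumed mapping the group is fully determined by the type, so the coefficient comes out very close to 1, which is consistent with the HIGH CORRELATION flag shown for both variables.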

Missing values

Sample

First rows

idamount_tshdate_recordedfundergps_heightinstallerlongitudelatitudewpt_namenum_privatebasinsubvillageregionregion_codedistrict_codelgawardpopulationpublic_meetingrecorded_byscheme_managementscheme_namepermitconstruction_yearextraction_typeextraction_type_groupextraction_type_classmanagementmanagement_grouppaymentpayment_typewater_qualityquality_groupquantityquantity_groupsourcesource_typesource_classwaterpoint_typewaterpoint_type_groupstatus_group
0695726000.02011-03-14Roman1390Roman34.938093-9.856322none0Lake NyasaMnyusi BIringa115LudewaMundindi109TrueGeoData Consultants LtdVWCRomanFalse1999gravitygravitygravityvwcuser-grouppay annuallyannuallysoftgoodenoughenoughspringspringgroundwatercommunal standpipecommunal standpipefunctional
187760.02013-03-06Grumeti1399GRUMETI34.698766-2.147466Zahanati0Lake VictoriaNyamaraMara202SerengetiNatta280NaNGeoData Consultants LtdOtherNaNTrue2010gravitygravitygravitywuguser-groupnever paynever paysoftgoodinsufficientinsufficientrainwater harvestingrainwater harvestingsurfacecommunal standpipecommunal standpipefunctional
23431025.02013-02-25Lottery Club686World vision37.460664-3.821329Kwa Mahundi0PanganiMajengoManyara214SimanjiroNgorika250TrueGeoData Consultants LtdVWCNyumba ya mungu pipe schemeTrue2009gravitygravitygravityvwcuser-grouppay per bucketper bucketsoftgoodenoughenoughdamdamsurfacecommunal standpipe multiplecommunal standpipefunctional
3677430.02013-01-28Unicef263UNICEF38.486161-11.155298Zahanati Ya Nanyumbu0Ruvuma / Southern CoastMahakamaniMtwara9063NanyumbuNanyumbu58TrueGeoData Consultants LtdVWCNaNTrue1986submersiblesubmersiblesubmersiblevwcuser-groupnever paynever paysoftgooddrydrymachine dbhboreholegroundwatercommunal standpipe multiplecommunal standpipenon functional
4197280.02011-07-13Action In A0Artisan31.130847-1.825359Shuleni0Lake VictoriaKyanyamisaKagera181KaragweNyakasimbi0TrueGeoData Consultants LtdNaNNaNTrue0gravitygravitygravityotherothernever paynever paysoftgoodseasonalseasonalrainwater harvestingrainwater harvestingsurfacecommunal standpipecommunal standpipefunctional
5994420.02011-03-13Mkinga Distric Coun0DWE39.172796-4.765587Tajiri0PanganiMoa/MweremeTanga48MkingaMoa1TrueGeoData Consultants LtdVWCZingibaliTrue2009submersiblesubmersiblesubmersiblevwcuser-grouppay per bucketper bucketsaltysaltyenoughenoughotherotherunknowncommunal standpipe multiplecommunal standpipefunctional
6198160.02012-10-01Dwsp0DWSP33.362410-3.766365Kwa Ngomho0InternalIshinabulandiShinyanga173Shinyanga RuralSamuye0TrueGeoData Consultants LtdVWCNaNTrue0swn 80swn 80handpumpvwcuser-groupnever paynever paysoftgoodenoughenoughmachine dbhboreholegroundwaterhand pumphand pumpnon functional
7545510.02012-10-09Rwssp0DWE32.620617-4.226198Tushirikiane0Lake TanganyikaNyawishi CenterShinyanga173KahamaChambo0TrueGeoData Consultants LtdNaNNaNTrue0nira/taniranira/tanirahandpumpwuguser-groupunknownunknownmilkymilkyenoughenoughshallow wellshallow wellgroundwaterhand pumphand pumpnon functional
8539340.02012-11-03Wateraid0Water Aid32.711100-5.146712Kwa Ramadhan Musa0Lake TanganyikaImalaudukiTabora146Tabora UrbanItetemia0TrueGeoData Consultants LtdVWCNaNTrue0india mark iiindia mark iihandpumpvwcuser-groupnever paynever paysaltysaltyseasonalseasonalmachine dbhboreholegroundwaterhand pumphand pumpnon functional
9461440.02011-08-03Isingiro Ho0Artisan30.626991-1.257051Kwapeto0Lake VictoriaMkonomreKagera181KaragweKaisho0TrueGeoData Consultants LtdNaNNaNTrue0nira/taniranira/tanirahandpumpvwcuser-groupnever paynever paysoftgoodenoughenoughshallow wellshallow wellgroundwaterhand pumphand pumpfunctional

Last rows

idamount_tshdate_recordedfundergps_heightinstallerlongitudelatitudewpt_namenum_privatebasinsubvillageregionregion_codedistrict_codelgawardpopulationpublic_meetingrecorded_byscheme_managementscheme_namepermitconstruction_yearextraction_typeextraction_type_groupextraction_type_classmanagementmanagement_grouppaymentpayment_typewater_qualityquality_groupquantityquantity_groupsourcesource_typesource_classwaterpoint_typewaterpoint_type_groupstatus_group
59390136770.02011-08-04Rudep1715DWE31.370848-8.258160Kwa Mzee Atanas0Lake TanganyikaKitontoRukwa152Sumbawanga RuralMkowe150TrueGeoData Consultants LtdVWCNaNFalse1991swn 80swn 80handpumpvwcuser-groupnever paynever paysoftgoodinsufficientinsufficientmachine dbhboreholegroundwaterhand pumphand pumpfunctional
59391448850.02013-08-03Government Of Tanzania540Government38.044070-4.272218Kwa0PanganiMaore KatiKilimanjaro33SameMaore210TrueGeoData Consultants LtdWater authorityHingililiTrue1967gravitygravitygravityvwcuser-groupnever paynever paysoftgoodenoughenoughriverriver/lakesurfacecommunal standpipecommunal standpipenon functional
59392406070.02011-04-15Government Of Tanzania0Government33.009440-8.520888Benard Charles0Lake RukwaMbuyuni AMbeya121ChunyaMbuyuni0TrueGeoData Consultants LtdVWCNaNTrue0gravitygravitygravityvwcuser-groupnever paynever paysoftgoodenoughenoughspringspringgroundwatercommunal standpipecommunal standpipenon functional
59393483480.02012-10-27Private0Private33.866852-4.287410Kwa Peter0InternalMasangaTabora142IgungaIgunga0FalseGeoData Consultants LtdWater authorityNaNFalse0gravitygravitygravityprivate operatorcommercialpay per bucketper bucketsoftgoodinsufficientinsufficientdamdamsurfaceotherotherfunctional
5939411164500.02011-03-09World Bank351ML appro37.634053-6.124830Chimeredya0Wami / RuvuKomstariMorogoro56MvomeroDiongoya89TrueGeoData Consultants LtdVWCNaNTrue2007submersiblesubmersiblesubmersiblevwcuser-grouppay monthlymonthlysoftgoodenoughenoughmachine dbhboreholegroundwatercommunal standpipecommunal standpipenon functional
593956073910.02013-05-03Germany Republi1210CES37.169807-3.253847Area Three Namba 270PanganiKiduruniKilimanjaro35HaiMasama Magharibi125TrueGeoData Consultants LtdWater BoardLosaa Kia water supplyTrue1999gravitygravitygravitywater boarduser-grouppay per bucketper bucketsoftgoodenoughenoughspringspringgroundwatercommunal standpipecommunal standpipefunctional
59396272634700.02011-05-07Cefa-njombe1212Cefa35.249991-9.070629Kwa Yahona Kuvala0RufijiIgumbiloIringa114NjombeIkondo56TrueGeoData Consultants LtdVWCIkondo electrical water schTrue1996gravitygravitygravityvwcuser-grouppay annuallyannuallysoftgoodenoughenoughriverriver/lakesurfacecommunal standpipecommunal standpipefunctional
59397370570.02011-04-11NaN0NaN34.017087-8.750434Mashine0RufijiMadunguluMbeya127MbaraliChimala0TrueGeoData Consultants LtdVWCNaNFalse0swn 80swn 80handpumpvwcuser-grouppay monthlymonthlyfluoridefluorideenoughenoughmachine dbhboreholegroundwaterhand pumphand pumpfunctional
59398312820.02011-03-08Malec0Musa35.861315-6.378573Mshoro0RufijiMwinyiDodoma14ChamwinoMvumi Makulu0TrueGeoData Consultants LtdVWCNaNTrue0nira/taniranira/tanirahandpumpvwcuser-groupnever paynever paysoftgoodinsufficientinsufficientshallow wellshallow wellgroundwaterhand pumphand pumpfunctional
59399263480.02011-03-23World Bank191World38.104048-6.747464Kwa Mzee Lugawa0Wami / RuvuKikatanyembaMorogoro52Morogoro RuralNgerengere150TrueGeoData Consultants LtdVWCNaNTrue2002nira/taniranira/tanirahandpumpvwcuser-grouppay when scheme failson failuresaltysaltyenoughenoughshallow wellshallow wellgroundwaterhand pumphand pumpfunctional